<aside> ℹ️ Objective: Develop an accelerated means of providing “essentially aligned” Bible translations in the top 10 languages of the Strategic Languages Initiative (Resource Level 4: Arabic (MSA), English, French, Hindi, Indonesian, Mandarin, Portuguese, Russian, Spanish, Swahili).

</aside>

Who is this for?

The primary users of Essentially Aligned Bible Translations are people involved in church-based Bible translation projects who have had little or no opportunity to study Greek, Aramaic, or Hebrew. They need an effective means of checking their translations against the essential elements of meaning in the original languages.

What does “Essentially Aligned” mean?

<aside> 🔀 An essentially aligned Bible translation links only the most important lexical items (e.g., nouns and verbs) to the original languages.

</aside>

Why is an “Essentially Aligned” Bible useful?

An essentially aligned Bible translation offers three main benefits for equipping the church for Bible translation:

  1. Content Indexing: It provides the critical points of connection to the original languages that make it possible to link many other resources (e.g., semantic lexicons, exegetical commentaries, and other foundational Bible study resources).
  2. Deep + Multilingual Study: It enables in-depth study of the Scriptures across multiple languages, leading to more trustworthy Bible translations.
  3. Translation Checking: It provides a structured means of checking minority-language translations against common-language translations that are connected to the original languages, which is the essential basis for affirming the trustworthiness of any Bible translation.

Why not fully aligned Bible translations?

Eventually, full alignment is desirable in at least the top 10 world languages. However, fully aligning Bibles is costly in both time and money, because people must align every element of every translation to every element of the original languages. Instead of starting with a “100%” solution, we are starting with the “20% of the work that yields 80% of the value” and building from there.

A Faster, Cheaper, More Effective Approach?

<aside> 💡 What if we could create a Machine Learning (ML) / Natural Language Processing (NLP) algorithm that could work from existing datasets to automatically generate an essentially aligned Bible translation in each of the top 10 Strategic Languages?

</aside>

Basic Approach: Using available tokenization, lemmatization, stemming, and POS-tagging data for these languages, together with existing (or derived) Strong's numbers for the nouns and verbs in each Book:Chapter:Verse, write a tunable, "zero language knowledge" algorithm that aligns the Bible translation (i.e., connects the Strong's numbers to the correct lemmas).
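
One way to picture the "zero language knowledge" idea is a purely statistical matcher. The sketch below is an assumption, not the project's chosen method: it scores every (Strong's number, lemma) pair by how often the two appear in the same verse (a Dice coefficient) and then picks the best-scoring lemma for each Strong's number in each verse. The function names `dice_scores` and `align`, the `min_score` parameter, and the hand-made `TOY_VERSES` data are all hypothetical and exist only for illustration.

```python
from collections import Counter
from itertools import product


def dice_scores(verses):
    """Score each (Strong's number, lemma) pair by verse-level co-occurrence.

    verses maps a verse reference to (set of Strong's numbers, set of lemmas).
    """
    strong_count, lemma_count, pair_count = Counter(), Counter(), Counter()
    for strongs, lemmas in verses.values():
        strong_count.update(strongs)
        lemma_count.update(lemmas)
        pair_count.update(product(strongs, lemmas))
    # Dice coefficient: 2 * |both| / (|strong| + |lemma|), in the range (0, 1].
    return {
        (s, l): 2 * n / (strong_count[s] + lemma_count[l])
        for (s, l), n in pair_count.items()
    }


def align(verses, min_score=0.5):
    """Pick the best-scoring lemma for each Strong's number in each verse."""
    scores = dice_scores(verses)
    alignment = {}
    for ref, (strongs, lemmas) in verses.items():
        for s in sorted(strongs):
            best = max(sorted(lemmas), key=lambda l: scores[(s, l)], default=None)
            if best is not None and scores[(s, best)] >= min_score:
                alignment[(ref, s)] = (best, scores[(s, best)])
    return alignment


# Hand-made toy data: simplified, incomplete verse contents for illustration only.
TOY_VERSES = {
    "JHN 3:16": ({"G2316", "G25", "G2889"}, {"god", "love", "world", "give", "son"}),
    "JHN 1:1": ({"G3056", "G2316"}, {"word", "god", "beginning"}),
    "1JN 4:8": ({"G2316", "G25"}, {"god", "love", "know"}),
    "JHN 1:10": ({"G2889"}, {"world", "know"}),
    "JHN 1:14": ({"G3056"}, {"word", "flesh"}),
}

if __name__ == "__main__":
    for (ref, strong), (lemma, score) in sorted(align(TOY_VERSES).items()):
        print(ref, strong, lemma, round(score, 2))
```

Nothing in this sketch depends on the target language: it needs only verse-level sets of lemmas and Strong's numbers, so the same code could in principle run on any of the ten languages. The `min_score` cutoff is the kind of knob that would be tuned per language.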

Proof of Concept Output: Target an average alignment probability of 90% or higher.

Next Steps: Implement a tunable algorithm that gives a human in the loop the ability to confirm and finalize alignments, with the tuned algorithm then improving the overall alignment.
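
As a rough illustration of what human-in-the-loop tuning could look like, the sketch below assumes that the algorithm emits scored proposals, that a reviewer accepts or rejects some of them, and that the acceptance threshold is then re-fitted to agree with those decisions as often as possible. The `Proposal` class, `tune_threshold`, `finalize`, and all scores shown are hypothetical; the source does not prescribe this mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """One proposed link between a Strong's number and a lemma (hypothetical shape)."""
    ref: str
    strong: str
    lemma: str
    score: float


def tune_threshold(reviewed):
    """Return the score cutoff that agrees with the most human decisions.

    reviewed maps each reviewed Proposal to True (accepted) or False (rejected).
    Ties are resolved in favour of the lowest cutoff.
    """
    candidates = sorted({p.score for p in reviewed})

    def agreements(threshold):
        return sum((p.score >= threshold) == ok for p, ok in reviewed.items())

    return max(candidates, key=agreements)


def finalize(proposals, reviewed, threshold):
    """Human decisions always win; unreviewed proposals fall back to the cutoff."""
    final = []
    for p in proposals:
        if p in reviewed:
            if reviewed[p]:
                final.append(p)
        elif p.score >= threshold:
            final.append(p)
    return final


# Made-up proposals and review decisions, for illustration only.
a = Proposal("JHN 3:16", "G25", "love", 0.95)
b = Proposal("JHN 3:16", "G2889", "world", 0.80)
c = Proposal("JHN 1:1", "G3056", "beginning", 0.55)
d = Proposal("JHN 1:1", "G2316", "god", 0.70)
e = Proposal("JHN 1:14", "G3056", "word", 0.75)   # not yet reviewed
f = Proposal("JHN 1:10", "G2889", "know", 0.50)   # not yet reviewed

PROPOSALS = [a, b, c, d, e, f]
REVIEWED = {a: True, b: True, c: False, d: True}

if __name__ == "__main__":
    cutoff = tune_threshold(REVIEWED)
    for p in finalize(PROPOSALS, REVIEWED, cutoff):
        print(p.ref, p.strong, p.lemma)
```

The design choice here is that reviewer decisions are never overridden by the algorithm; they only change how the remaining, unreviewed proposals are treated.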

Design: Architecting a stack with interchangeable NLP components (tokenizing, stemming, POS tagging) would allow incremental improvements at each layer. The design should not assume that all data will be available or reliable. For example, POS tagging, especially for nouns, would be beneficial but is not available for all languages. Different languages are served by different libraries, some of which require rules, lexicons, and/or training and tuning.
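
A minimal sketch of such a stack, assuming each layer is just a callable that can be swapped per language and that the POS tagger is optional: when no tagger is available, every token is kept as a candidate instead of only nouns and verbs. The `Pipeline` class and the toy tokenizer, stemmer, and tagger below are hypothetical stand-ins, not real library components.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


def regex_tokenize(text: str) -> List[str]:
    """Toy tokenizer: runs of word characters."""
    return re.findall(r"\w+", text)


def lowercase(token: str) -> str:
    """Fallback normalizer for languages with no stemmer or lemmatizer."""
    return token.lower()


@dataclass
class Pipeline:
    tokenize: Callable[[str], List[str]]
    normalize: Callable[[str], str] = lowercase
    tag: Optional[Callable[[List[str]], List[str]]] = None  # may be unavailable
    keep_tags: frozenset = frozenset({"NOUN", "VERB"})

    def candidates(self, text: str) -> List[str]:
        """Return normalized tokens that are candidates for alignment."""
        tokens = self.tokenize(text)
        if self.tag is not None:
            tags = self.tag(tokens)
            tokens = [t for t, pos in zip(tokens, tags) if pos in self.keep_tags]
        return [self.normalize(t) for t in tokens]


# Toy English components, standing in for real per-language libraries.
TOY_LEXICON = {"God": "NOUN", "loved": "VERB", "the": "DET", "world": "NOUN"}


def toy_tag(tokens: List[str]) -> List[str]:
    return [TOY_LEXICON.get(t, "X") for t in tokens]


def toy_stem(token: str) -> str:
    token = token.lower()
    return token[:-1] if token.endswith("ed") else token


if __name__ == "__main__":
    bare = Pipeline(tokenize=regex_tokenize)
    full = Pipeline(tokenize=regex_tokenize, normalize=toy_stem, tag=toy_tag)
    print(bare.candidates("God loved the world"))
    print(full.candidates("God loved the world"))
```

Because each layer is passed in rather than hard-coded, a language with a good lemmatizer but no tagger can still use the same pipeline, and a better component can replace a weaker one without touching the alignment step.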

Considerations:

Regarding Technology:

Regarding Output:

<aside> ℹ️ Target: Complete “Essentially Aligned Bibles” datasets for all 10 languages by end of 2023 Q2.

</aside>

Hackathon Notes

Project management (Draft)