When words fail: audio recording for verification in multilingual surveys

A survey being conducted in Monguno, Nigeria. Mobile phones and tablets are ubiquitous in humanitarian data collection efforts. Yet most mobile tools do not support continuous audio recording while the survey is being administered. Photo by: Eric DeLuca, Translators without Borders

“Sir, I want to ask you some questions if you agree?”

With that one sentence, our enumerator summarized the 120-word script provided to secure the informed consent of our survey participants – a script designed, in particular, to emphasize that participation would not result in any direct assistance. Humanitarian organizations, research institutes and think tanks around the world conduct thousands of surveys every year. How many suffer from similar ethical challenges? And how many substandard survey results slip under the radar due to a lack of effective quality assurance?

We were conducting a survey on the relationship between internal displacement, cross-border movement, and durable solutions in Borno, a linguistically diverse state in northeast Nigeria. Before data collection began, Translators without Borders (TWB) translated the survey into Hausa and Kanuri to limit the risk of mistranslations due to poor understanding of terminology. Even with this effort, however, not all the enumerators could read Hausa or Kanuri. Although enumerators spent a full day in training going through the translations as a group, there is still a risk that language barriers may have undermined the quality of the research. Humanitarian terminology is often complex, nuanced, and difficult to translate precisely into other languages. A previous study by Translators without Borders in northeastern Nigeria, for example, found that only 57% of enumerators understood the word ‘insurgency’.

We only know the exact phrasing of this interview because we decided to record some of our surveys using an audio recorder. In total, 96 survey interviews were recorded. Fifteen percent of these files were later transcribed into Hausa or Kanuri and translated into English by TWB. Those English transcripts were compared to the enumerator-coded responses, allowing us to analyze the accuracy of our results. While the process was helpful, the findings raise some important concerns.

A digital voice recorder in Maiduguri, Nigeria serves as a simple and low-tech tool for capturing entire surveys. Photo by: Eric DeLuca / Translators without Borders

Consent was not always fully informed

Efforts to obtain informed consent were limited, despite the script provided. According to the consultant, enumerators felt rushed because large numbers of people were waiting to participate in the survey. Yet people were interested in participating precisely because they mistakenly believed that participation could result in assistance, which underlines the need for informed consent.

Alongside these ethical challenges, the failure to inform participants about the objectives of the research increases the risk of bias in the findings, prompting people to tailor responses to increase their chances of receiving assistance. Problems related to capacity, language, or questionnaire design can also negatively impact survey results, undermining the validity of the findings. 

The enumerator-coded answers did not always match the transcripts

During data quality assurance, we also identified important discrepancies between the interview transcripts and the survey data. In some cases, enumerators had guessed the most likely response rather than properly asking the question, jumping to conclusions based on their understanding of the context rather than respondents’ lived experiences. If the response was unclear, enumerators selected random response options without seeking clarification. Some questions were skipped entirely, but responses were still entered into the surveys. The following example, comparing an extract of an interview transcript with the recorded survey data, illustrates these discrepancies.

Interview transcript:

Interviewer: Do you want to go back to Khaddamari?

Respondent: Yes, I want to.

Interviewer: When do you want to go back?

Respondent: At any time when the peace reigns. You know we are displaced here.

Interviewer: If the place become peaceful, will you go back?

Respondent: If it becomes peaceful, I will go back.

Survey data (each question, followed by the response the enumerator recorded):

Do you want to return to Khaddamari in the future? Yes

When do you think you are likely to return? Within the next month

What is the main reason that motivates you to return? Improved safety

What is the second most important reason? Missing home

What is the main issue which currently prevents return to Khaddamari? Food insecurity

What is the second most important issue preventing return? Financial cost of return

At no point in the interview did the respondent mention that he or she was likely to return in the next month. Food insecurity or financial costs were also not cited as factors preventing return. Without audio recordings, we would never have become aware of these issues. Transcribing even just a sample of our audio recordings drew attention to significant problems with the data. Instead of blindly relying on poor quality data, we were able to triangulate information from other sources, and use the interview transcripts as qualitative data. We also included a strongly worded limitations section in the report, acknowledging the data quality issues.
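The comparison between transcripts and enumerator-coded answers can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function name flag_discrepancies, the data layout, and the reviewer's reading of the transcript are all assumptions made for the example. Only the coded answers are taken from the extract above.

```python
def flag_discrepancies(coded, reviewed):
    """Flag coded answers that the transcript does not support.

    coded:    question -> answer entered by the enumerator
    reviewed: question -> answer a reviewer found in the transcript
              (hypothetical layout; a missing question means the
              reviewer found no answer in the recording)
    """
    flags = []
    for question, coded_answer in coded.items():
        transcript_answer = reviewed.get(question)
        if transcript_answer is None:
            flags.append((question, coded_answer,
                          "not asked or not answered in recording"))
        elif transcript_answer != coded_answer:
            flags.append((question, coded_answer,
                          "transcript suggests: " + transcript_answer))
    return flags


# Coded answers from the survey extract above.
coded = {
    "Do you want to return to Khaddamari in the future?": "Yes",
    "When do you think you are likely to return?": "Within the next month",
    "What is the main issue which currently prevents return to Khaddamari?":
        "Food insecurity",
}

# Illustrative reviewer notes (an assumption, not data from the study).
reviewed = {
    "Do you want to return to Khaddamari in the future?": "Yes",
    "When do you think you are likely to return?": "When it becomes peaceful",
}

flags = flag_discrepancies(coded, reviewed)
for question, coded_answer, note in flags:
    print(f"{question} | coded: {coded_answer} | {note}")
```

Exact string matching is deliberately crude here. In practice a reviewer would map transcript answers onto the survey's own response options before comparing.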

We suspect such data quality issues are common. Surveys may simply not be the most appropriate tool for data collection in the contexts in which we operate. Certainly, there is a need to be more aware of, and more transparent about, survey limitations.

Despite these limitations, there is no doubt that surveys will continue to be widely used in the humanitarian community and beyond. Surveys are ingrained in the structure and processes of the humanitarian industry. Despite the challenges we faced in Nigeria, we will continue to use surveys ourselves. We know now, however, that audio recordings are invaluable for quality assurance purposes. 

A manual audio recording strategy is difficult to replicate at scale

In an ideal world, all survey interviews would be recorded, transcribed, and translated. This would not only enhance quality assurance processes, but also complement survey data with rich qualitative narratives and quotes. Transcribing and translating recordings, however, requires substantial technical and human resources.

From a technical standpoint, recording audio files of surveys is not straightforward. Common cell phone data collection tools, such as Kobo, do not offer full-length audio recordings as standard features within surveys. There are also storage issues, as audio files take up significant space on cell phones and stretch the limits of offline survey tools or browser caching. Audio recorders are easy to find and fairly reliable, but they require setting up a parallel workflow and a careful process of coding to ensure that each audio file is appropriately connected to the corresponding survey.
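One way to keep a parallel audio workflow connected to the survey data is to embed the survey ID in each recording's file name and check the two lists against each other at the end of every collection day. The sketch below assumes a hypothetical naming convention (survey ID, underscore, enumerator ID) and survey IDs that contain no underscores. None of these names come from Kobo or from the study itself.

```python
def audio_filename(survey_id, enumerator_id, ext="wav"):
    """Build a file name that carries the survey ID (assumed convention)."""
    return f"{survey_id}_{enumerator_id}.{ext}"


def match_recordings(survey_ids, filenames):
    """Pair survey IDs with recordings named by audio_filename().

    Returns (matched, missing, orphans):
      matched - survey ID -> list of recordings for that survey
      missing - survey IDs with no recording
      orphans - recordings whose ID is not in the survey data
    """
    by_id = {}
    for name in filenames:
        stem = name.rsplit(".", 1)[0]          # drop the extension
        survey_id = stem.split("_", 1)[0]      # text before the first "_"
        by_id.setdefault(survey_id, []).append(name)

    wanted = set(survey_ids)
    matched = {sid: by_id[sid] for sid in survey_ids if sid in by_id}
    missing = [sid for sid in survey_ids if sid not in by_id]
    orphans = sorted(
        name
        for sid, names in by_id.items()
        if sid not in wanted
        for name in names
    )
    return matched, missing, orphans


# Illustrative IDs only.
surveys = ["S001", "S002", "S003"]
files = ["S001_E07.wav", "S003_E02.wav", "S999_E02.wav"]
matched, missing, orphans = match_recordings(surveys, files)
print("missing recordings:", missing)
print("unlinked recordings:", orphans)
```

Running a check like this daily, rather than at the end of data collection, leaves time to ask enumerators about missing or mislabeled files while they still remember the interviews.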

From a time standpoint, this process is slow and involved. As a general rule, it takes roughly six hours to transcribe one hour of audio content. In Hausa and Kanuri – two low-resource languages that lack experienced translators – one hour of audio often took closer to eight hours to transcribe. The Hausa or Kanuri transcripts then had to be translated into English, a process that took an additional eight hours. Therefore, each 30-minute recorded survey required about one day of additional work to process fully. To put that into perspective, one person would have to work full time every day for close to a year to transcribe and translate a survey involving 350 people.
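The arithmetic behind these estimates can be written out as a short calculation. The sketch assumes that both the eight hours of transcription and the eight hours of translation apply per hour of audio, and that a working day is eight hours long. The second assumption is ours, and the function name is illustrative.

```python
# Figures from the text, read as hours of work per hour of audio.
TRANSCRIBE_HOURS_PER_AUDIO_HOUR = 8
TRANSLATE_HOURS_PER_AUDIO_HOUR = 8

# Assumption: one working day is eight hours.
WORKDAY_HOURS = 8


def processing_days(n_surveys, minutes_per_survey=30):
    """Working days needed to transcribe and translate n_surveys recordings."""
    audio_hours = n_surveys * minutes_per_survey / 60
    work_hours = audio_hours * (
        TRANSCRIBE_HOURS_PER_AUDIO_HOUR + TRANSLATE_HOURS_PER_AUDIO_HOUR
    )
    return work_hours / WORKDAY_HOURS


print(processing_days(1))    # one 30-minute survey
print(processing_days(350))  # a survey involving 350 people
```

Under these assumptions one recorded survey costs one working day, so 350 surveys cost 350 working days, which matches the estimate of close to a year of daily work for one person.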

Language technology can offer some support

In languages such as English or French, solutions already exist to drastically speed up this process. Speech-to-text technologies – the same technologies used to send SMS messages by voice – have improved dramatically in recent years with the adoption of machine learning approaches. This makes it possible to transcribe and translate audio recordings in a matter of seconds, not days. The error rates of these automated tools are low, and in some cases are even close to rivaling human output. For humanitarians working in contexts with well-resourced languages like Spanish, French, or even some dialects of Arabic, these language technologies are already able to offer significant support that makes an audio survey workflow more feasible.

For low-resource languages such as Hausa, Kanuri, Swahili, or Rohingya, these technologies do not exist or are too unreliable. That is because these languages lack the commercial viability to be priority languages for technology companies, and there is often insufficient data to train the machine translation technologies. In an attempt to close the digital language divide, Translators without Borders has recently rolled out an ambitious effort called Gamayun: the language equality initiative. This initiative is working to develop datasets and language technology in low-resource languages relevant to humanitarian and development contexts. The goal is to develop fit-for-purpose solutions that can help break down language barriers and make language solutions such as this more accessible and feasible. Still, this is a long-term vision and many of the tools will take months or even years to develop fully.

In the meantime, there are four things you can do now to incorporate audio workflows into your data collection efforts

  1. Record your surveys using tape recorders. It is a valuable process, even if you are limited in how you are able to use the recordings right now. In our experience, enumerators are less likely to intentionally skip entire questions or sections if they know they are being recorded. Work is underway to integrate audio workflows directly into Kobo and other surveying tools, but for now, a tape recorder is an accessible and affordable tool.

  2. Transcribe and translate a small sample of your recordings. Even a handful of transcripts can prove to be useful verification and training tools. We recommend you complete the translations in the pilot stage of your survey, to give you time to adjust trainings or survey design if necessary. This can at least provide spot checks of enumerators that you are concerned about, or simply verify one key question, such as the question about informed consent.

  3. Run your recordings through automated transcription and translation tools. This will only be possible if you are working in major languages such as Spanish or French. Technology is developing rapidly: every month more languages become available and the quality of these technologies improves. Commercial services are offered by Microsoft, Google, and Amazon, amongst others, but these services often have a cost, especially at scale.

  4. Partner with TWB to improve technology for low-resource languages. TWB is actively looking for partners to pilot audio recording and transcription processes, to help gather voice and text data to build language technologies for low-resource languages. TWB is also seeking partners interested in actively integrating these automated or semi-automated solutions into existing workflows. Get in touch if you are interested in partnering: [email protected]
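For the second step above, a reproducible random draw is one simple way to choose which recordings to transcribe. The sketch below is an assumption about how such a draw could be done, not a description of how the sample in this study was selected. The function name, the seed, and the recording IDs are illustrative, while the 15 percent fraction and the 96 recordings echo the figures reported earlier.

```python
import random


def pick_sample(recording_ids, fraction=0.15, seed=2019):
    """Draw a reproducible random sample of recordings to transcribe.

    A fixed seed means the same list of IDs always yields the same
    sample, so the selection can be documented and re-checked later.
    """
    ids = list(recording_ids)
    # At least one recording, never more than are available.
    k = min(len(ids), max(1, round(len(ids) * fraction)))
    rng = random.Random(seed)
    return sorted(rng.sample(ids, k))


# Illustrative IDs for 96 recorded interviews.
recordings = [f"S{n:03d}" for n in range(1, 97)]
sample = pick_sample(recordings)
print(len(sample), "of", len(recordings), "recordings selected")
```

If particular enumerators are a concern, the same function can be applied separately to each enumerator's recordings so that every enumerator appears in the sample.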
Written by:

Chloe Sydney, Research Associate at IDMC

Eric DeLuca, Monitoring, Evaluation, and Learning Manager at Translators without Borders

Marginalized mother languages – two ways to improve the lives of the people who speak them

21 February. This is the date chosen by UNESCO for International Mother Language Day, which has been observed worldwide since 2000. This year deserves special attention as 2019 is the International Year of Indigenous Languages. Both initiatives promote linguistic diversity and equal access to multilingual information and knowledge.

Languages can be a huge resource. At the same time, the mother language that people speak can be a barrier to accessing opportunities. People who speak marginalized mother languages often belong to remote or less prosperous communities and, as a result, they are more vulnerable when a crisis hits.

Yet, the humanitarian and development sector has been largely blind to the importance of language. International languages such as English, French, Arabic, and Spanish dominate, excluding the people who most need their voices heard. Marginalized language speakers are denied opportunities to communicate their needs and priorities, report abuse, or get the information they need to make decisions.

If aid organizations are to meet their high-level commitments to put people at the center of humanitarian action and leave no one behind, this needs to change. To understand better how to address language barriers facing marginalized communities, two actions can lead our sector in the right direction.

Aerial view of Monguno, Borno State, Nigeria. Photo by Eric DeLuca, Translators without Borders.

Putting languages on the map

The first is language mapping. No comprehensive and readily accessible dataset exists on which languages people speak where.

TWB has started to fill that gap by creating maps from existing data and from our own research. Our interactive map shows the language and communication needs of internally displaced people in northeast Nigeria. The map uses data collected by the International Organization for Migration’s Displacement Tracking Matrix team. This data shows, for instance, that access to information is a serious problem at over half of sites where Marghi is the dominant language. Aid organizations can use this map to develop the right communication strategy for reaching people in need.

Humanitarian and development organizations can add some simple standard questions to their household surveys and other assessments to gather valuable language data. Aid workers would then be better placed to understand the communication needs and preferences of the 176 million people in need of humanitarian assistance globally.

But communication in a crisis situation – or in any situation – should not be one-way. That’s where the second action comes in.

Building machine translation capacity in marginalized languages

Language technology has dramatically shifted two-way communication between people who speak different languages. In order to truly help people in need, listen to and understand them, we need to apply technology to their languages as well.

TWB is leading the Gamayun Language Equality Initiative to make it happen. We have built a closed-environment, domain-specific Levantine Arabic machine translation engine for the UN World Food Programme. This initiative will improve accountability to Syrian refugees facing food insecurity. Initial testing indicates that Gamayun will provide an efficient method for accessing local information sources. It will enable aid organizations to better understand the needs of their target populations, especially in hard-to-reach areas.

TWB Fulfulde Team Lead conducting comprehension research. Waterboard camp in Monguno, Borno State, Nigeria. Photo by Eric DeLuca, Translators without Borders.

We need to continue building the parallel language datasets from humanitarian and development content that make machine translation a viable option. That will expand the evidence that machine translation can enable better communication, including by empowering affected people to hold aid organizations to account in their own language.

Taking action

These two actions can help the humanitarian and development sector improve lives by promoting two-way communication with speakers of marginalized languages.  These actions will need to be expanded to be truly effective, but International Mother Language Day in the Year of Indigenous Languages is a great time to start.

To read:

    • The IFRC 2018 World Disasters Report, which includes clear and compelling recommendations about the importance of language to ensure that the world’s most vulnerable people are not “left behind”
    • TWB’s white paper on the Gamayun Language Equality Initiative

To do:

    • Consult our dashboard and think about how you can start collecting this data to inform your programs
    • Follow our journey as we continue to move forward with Gamayun (and learn along the way!)
    • Email us if you have an idea to share or want to do more in this area: [email protected]

Written by Mia Marzotto, Senior Advocacy Officer for Translators without Borders.

Using language to support humanitarians

Humanitarian emergencies know no language boundaries.

In the 13 countries currently experiencing the most severe crises, people speak over 1,200 languages. Yet, humanitarians operating in these crises often do not have the necessary language support, making their jobs even more difficult. 

World Humanitarian Day on 19 August is an opportune moment to reflect on this challenge. On this day, we honor all aid workers risking their lives to help people facing disasters and conflicts. At Translators without Borders (TWB), we believe that language should not stand in the way of the ability of these dedicated and brave people to deliver life-saving support.

Yahaya (center left), TWB Kanuri Team Leader, conducts research on how well words like "stress" and "abuse" are understood in Kanuri and whether words like "rape" and "mental health" carry a stigma. Internally displaced people’s camp, Maiduguri, Borno State, Nigeria.

Yet, too often, aid agencies do not give their staff the appropriate resources and tools to engage with communities and local responders in a language they understand. Translation is a consistent challenge, but mostly overlooked in humanitarian budgets amid other more tangible items. As a result, humanitarian workers are often forced to rely on unsupported national colleagues, untrained interpreters, English-centric jargon, and procedures that may exclude those who speak local languages.

The consequences of overlooking the need for language support are dire for the people in need of humanitarian aid – and pretty tough for humanitarian workers themselves.

Many of these aid workers are forced to rely on national staff or local community members to act as translators or interpreters. These staff members are largely expected to deal with the many challenges that differences in languages present on their own, although translation skills are rarely what they are recruited for. Program documentation such as guidelines, manuals, and other materials including specialized terminology is translated without training or support. Some may be working between two languages when neither is their first language.

Situations where interviews with community members pass through three or four languages are not uncommon. An international aid worker may speak in English, a national staff member interprets into the national language, and then a local school teacher interprets into the language of that village, and back again. This approach multiplies the potential loss of information in translation and lacks proper quality assurance. It also forces under-supported humanitarian staff or community members to perform a stressful task with little or no confidence that people’s information and communication needs are being met.

Mustapha (left), TWB - Hausa Team Lead, works with enumerators from the Danish Demining Group / Danish Refugee Council to conduct research on comprehension of information in various languages and formats at Farm Centre IDP Camp in Maiduguri, Borno State, Nigeria.
On World Humanitarian Day, we honor all humanitarian aid workers, including our staff, and commit to ensuring language does not stand in the way of their ability to support and empower those who need it most. Here, Mustapha (left), TWB Hausa Team Leader, conducts language comprehension research in Maiduguri, Nigeria.

The fact that complex humanitarian terms and concepts in English are not directly translatable into other languages compounds the problem for humanitarians. TWB’s research in different contexts has found that even aid workers do not always understand the English concepts they are asked to interpret. For example, “violence against women” was translated into Rohingya as “violent women” and “food security” in northeast Nigeria as “food protected by guards”. Comprehension rates among humanitarian data collectors are as low as 35 percent in some places. The result may be, at best, confusion or misunderstanding, and, at worst, inaccurate data upon which response plans are built. It is also undoubtedly stressful for those trying to do their best in challenging circumstances.

A lack of language support can also undermine coordination with and involvement of local responders. When meetings are held in a national or international language, for example, local language speakers are excluded from decision-making. This is not only a matter of dignity and mutual respect, but it is also a crucial precondition for tapping into local knowledge and capacities, allowing those on the frontline of a response to avoid delays in making potentially life-affecting decisions.

In short, humanitarian aid workers are better equipped to ensure people affected by crisis receive timely and relevant aid when they have proper language support.

This support begins with collecting the data needed to plan for language needs, and resourcing those needs appropriately. Training and capacity development programs can help build translation and interpreting capacity in languages for which there are no professional translators. A library of resource materials and tools in the relevant languages can be built up for all aid providers to make use of.

As we mark World Humanitarian Day on 19 August, it is time to shift our attention to how we can use language services to support humanitarian workers trying to help in the most dire of circumstances. Addressing language barriers between humanitarians and crisis-affected communities can help deliver the humanitarian world’s commitment to quality and accountability across responses, supporting and empowering those who need it most.

 

Read more about TWB’s response in northeast Nigeria.

Join us as a partner to benefit from our translator community, or sponsor us and enable TWB to provide humanitarian workers with the language support they need.

Written by Mia Marzotto, Senior Advocacy Officer for Translators without Borders.

Photographs by Eric DeLuca, Monitoring, Evaluation and Learning Manager for Translators without Borders.

Bringing words to life in northeast Nigeria

I recently returned from northeast Nigeria, where Translators without Borders (TWB) is providing language support in one of the most severe humanitarian crises and linguistically diverse areas in the world. Unsurprisingly, I had many conversations about language issues with humanitarian responders.

The good news is that many were already aware of the need to communicate information in languages people understand, despite humanitarian programming often disregarding local language communications. When hearing about TWB’s language support capacity, many felt relieved that someone might be able to help them tackle language barriers. The bad news is that, even with that acknowledgment, the most common refrain I heard throughout my four-week assignment was, “I have never thought about language so carefully before and neither has my organization.”

So I found myself asking, “How much is being lost in translation?” And, more importantly, “If two-way communication in the right languages in northeast Nigeria was truly integrated into programming, how would humanitarian action improve?”

The fact is that the importance of two-way communication between local communities and aid providers, in a language affected people can understand, is increasingly recognized by humanitarians.

Some of the best humanitarian programs are now consciously factoring language into their efforts to meet people’s information and communication needs. They do so recognizing that only when those needs are met can affected people reliably access assistance, provide input, and make the best decisions for themselves and their families. But despite the nod to language, mainstreaming solutions to language barriers within humanitarian work is still not the norm.

This was clear to me in northeast Nigeria.

After nine years, the humanitarian crisis remains one of the most severe in the world. In the three worst-affected states of Borno, Adamawa, and Yobe, 1.9 million Nigerians have been displaced from their homes; overall, 7.7 million people are in need of humanitarian assistance. Data shows that displaced people speak over 30 languages as their mother tongues. Overwhelmingly, they prefer receiving information in their own language. However, humanitarian responders are communicating with affected people mainly in two languages, Hausa and Kanuri. This is not enough to meet people’s needs, and serious problems persist due to the lack of two-way communication.

Humanitarian field staff shared many concerns about language needs in the response. They were unsure how to provide potentially life-saving information in camps where they do not know which languages people understand. There was concern that language diversity and low education levels prevent them from accurately gauging people’s needs and priorities. I also heard frustrations from some aid workers, particularly those who spoke local languages in addition to Hausa or Kanuri. These field workers are often asked to translate complex messages and concepts into those local languages with little or no support or experience in translation. In this situation, I wasn’t surprised that translation was seen as a considerable additional burden for multilingual staff, often an add-on to agreed job descriptions.

These conversations were both concerning and compelling. It’s no secret that for field workers in the humanitarian aid sector, day-to-day work can be more than a little complicated. Language should help, not hinder, the ability to provide effective and accountable aid to those who need it.

The problem is not a lack of awareness among field staff. What is missing is for those who direct organizational policies and program design to focus on language needs early in a response and appropriately resource language support.

To that end, it was exciting to be working with TWB’s team on the ground in northeast Nigeria. We are striving to provide that language support for humanitarian responders communicating with vulnerable people. We have already started to roll out the TWB Glossary for Northeast Nigeria – an in-the-hand tool for humanitarian field staff, interpreters, and translators to ensure use of consistent, accurate, and easily understood words in local languages.  

Yet so much more needs to be done.

The only way for this tool and other forms of language support to make a difference is by mainstreaming their use across the humanitarian response. This begins with ensuring field staff have the knowledge and resources to meet language needs in the response – and the support internally to prioritize the role of language in communication and community engagement programs. Otherwise, we risk seeing too few of these examples reach their potential for humanitarian accountability and effectiveness.

Having conversations about the importance of two-way communication in the languages of the most vulnerable is the necessary first step. Now we must move from words to action about language.

Like most things in life, it’s not what you do but how you do it.

Read more about TWB’s response in northeast Nigeria. 

Written by Mia Marzotto, Advocacy Officer for Translators without Borders.