Siri can't speak Irish: Tackling the digital gaps for the Irish language

Irish researchers are working with colleagues across the EU to tackle language inequality for minority languages online.

SIRI CANNOT SPEAK Irish. Neither can the autocorrect on your phone. You can’t Google Seán Breathnach’s movie Foscadh using speech-to-text technologies – or anything in Irish for that matter.

Whether it be Irish, Croatian, Lithuanian, Basque or Maltese, technologically-aided communication is much more difficult in some languages than in others such as English or German. Maybe this was trivial ten years ago, but in our increasingly digital world, the situation is believed to pose grave risks to the future of the Irish language and other European languages.

There are currently more than 21 European languages in danger of ‘digital extinction’, a process in which a language becomes less relevant in daily digital life and is subsequently spoken less offline, a space that is already shrinking. In 2018, the European Parliament passed the Language Equality in the Digital Age resolution, which led to the establishment of the European Language Equality Project (ELE), a 52-partner research programme led by Dublin City University, which aims to help Europe achieve full digital language equality by 2030.

“It is ambitious,” says Dr Teresa Lynn, an expert in computational linguistics and author of the ELE’s Irish Language Report, which was published earlier this month. “If you’re in the field, you know that means there’s a lot of work to be done in the next eight years.

“Right now we’re identifying the gaps for over 80 EU languages in order to get a clear picture of how things stand, before setting out a strategic agenda and roadmap.

“Everything in this space happens for English first, because the investment usually comes from the big tech companies; new technologies are always driven by market demands.”

Relying on the market in this context, however, was never going to work out for comparatively smaller languages. In the Irish language context, where all Irish speakers also speak English, Gaeilgeoirí are forced out of using Irish in many contexts due to the technical difficulties that come with it.

Dr Lynn says: “People will shift to English and say, ‘autocorrect or predictive text drives me mad’ just because it doesn’t recognise Irish, or ‘I can’t speak to Siri or Alexa in Irish, I’ll have to speak in English instead’ and, slowly, you get this language shift happening – unwillingly, maybe unknowingly.

“So, as much as people might want to use Irish in their daily lives, the technology is almost forcing them into English. That’s just a simple example of what’s going to happen across Europe, if something doesn’t change in terms of the technologies made available.”

Web campaign tools like SurveyMonkey, according to Dr Lynn, have options like a “smart assistant component, where it will analyse your campaign, read what you put into the emails you’re sending and come back and say, ‘these are the key words for a successful campaign, these are the important things, or you should shorten the phrase, and so on’.

Getting AI on board with Irish

“That’s fine if your campaign is in English, because it’s able to understand English and make these recommendations.

“This type of language AI doesn’t exist for Irish,” she continues. “You might think that’s no big deal now, but in the future it is likely to be. And we should ask ourselves why can’t those who want to do campaigns in Irish avail of the same advantages that those who are doing it in English avail of?”

Getting algorithms to learn how to deal with a morphologically rich and unique language like Irish so they can perform translations, auto-generate subtitles, give good search results or automate help systems and so on requires a lot of work. 

One of the reasons it doesn’t exist in Irish is that many of these kinds of tools are built using machine learning, where the AIs are trained to perform these linguistic tasks. For example, machine translation systems need to have first seen huge amounts of professionally translated text in order to make new translation predictions.

In addition, for other tools, the training text or speech data might need to be given extra linguistic information which would require annotators trained in linguistics. There are very few people qualified to do this for the Irish language.

  • Hear Dr Lynn speak on The Good Information Project’s Open Newsroom webinar panel on this and other challenges to language equity in the EU

What’s worrying is not necessarily the situation Irish speakers find themselves in right now. What’s worrying is how much digital language inequality has grown so far and the potential it has to develop further.

Linguistics and the economic divide

According to the ELE’s Technology Deep Dive paper published in February, it is predicted that by 2025, 50% of knowledge workers will use an AI-based virtual assistant, a technology available for major European languages but not currently for minority ones like Irish. In a non-Irish context, the ELE points out how this would exacerbate an economic divide in which some countries will gain an advantage while others, where no major European language is commonly spoken, will lag behind.

“This is why the European Commission are saying okay, we have to do something to address this because technology investment can’t just be market-driven alone,” notes Dr Lynn.

“But it’s not as simple as people sitting back and waiting for this inequality to disappear now that the ELE exists,” she says.

“There needs to be a mindset change of Irish people realising that this really is an issue, of younger Irish speakers taking an interest and saying I want to study computer science because I want to build these systems for Irish.”

Developments in virtual assistants are coming rapidly, although they are available in only a select few languages. The virtual medical assistant market is expected to grow from $1.1 billion in 2021 to $6.0 billion by 2026. Likewise, aspects of services in the legal, banking and insurance industries are becoming more and more automated for the customer’s ease.

But what happens when that customer’s first language is not the language of the automation, or vice versa? What tedious manual processes will remain available to them in the future? Will they have to default to a supported language? What does this tell younger generations about those lesser supported languages? That they are lesser?

The future of sentiment analysis of online political commentary is another worrying area for multilingual countries such as Ireland or Malta. In sentiment analysis, AI systems gather text from sources such as news articles and social media, and automatically analyse what is being said. The UK government has employed this kind of AI to analyse the feedback provided by citizens on its gov.uk website.

Exclusion of voices

Of course, only opinions or comments in the technologically supported languages will be represented. That means either that the technology cannot be implemented in multilingual countries, leaving the political systems of those states stuck in the past, or that voices in minority languages are excluded from the political process.

The ELE knows the urgency of this situation, and understands that the power to change things also lies in the hands of people – those with the tech expertise who can build these systems in different languages and those who want their language supported.

“Take Ireland for example, you’ve got large multinational tech companies enjoying the advantages that come with having Irish EU headquarters, but really, are they taking enough interest in supporting the local Irish language?” says Dr Lynn.

Since the markets work off demand, she asks: “Is there also enough demand coming from Irish people saying, why is our language not being considered?”

This work is co-funded by Journal Media and a grant programme from the European Parliament. Any opinions or conclusions expressed in this work are the author’s own. The European Parliament has no involvement in nor responsibility for the editorial content published by the project. For more information, see here.
