The social media platform’s Clubhouse rival is an audio-heavy experience, so how accessible is the feature to deaf and disabled people, and what are the auto-captions like? Deaf journalist and campaigner Liam O’Dell investigates…
Credit where credit’s (over)due, Twitter seem to have finally learned their lesson after the mishap over voice tweets in the summer of 2020. Deaf people are finally involved in the process of developing new audio features, and it’s been done as early as possible, too.
“The mic is yours,” read a message when I opened up my Twitter mobile app on Wednesday evening. “You can now host and join live conversations in Spaces. Go on. Try it.” A couple of minutes later, I did just that.
Having received assurances that Spaces would come with automatic captioning from the outset, I was curious to see just how accurate these would be. Because the technology relies heavily on voice recognition, auto-captioning, where implemented, is still far from perfect. One only needs to look at fellow social media platform YouTube to understand just how much further that site still has to go with the software, despite introducing it way back in November 2009.
At first glance, Twitter seems to have the same teething problems experienced by any new voice recognition software. Several listeners in my Space pointed out instances where the captions had misinterpreted my words as far more inappropriate ones. For one user, ‘Space’ had been rendered as ‘spice’.
Naturally, there are some balances to be struck when it comes to this software. First and foremost, accuracy needs to be considered alongside speed. Listeners pointed out that the captions were quick to display, but at the expense of context, full sentences and precision. Transcriptions were/split up/like this, no doubt making them difficult to process for neurodivergent folk, as well as for deaf and hard of hearing individuals such as myself.
Of course, I can only speak for myself, but I’d certainly take slower, more accurate captions which are longer in length (within reason) over sudden and abrupt ones. When we’re talking about access, the ability to follow the conversation is integral, and abrupt, fragmented captions do not help. At times when other people joined as speakers, the captions took a while to load, leading to me missing parts of the conversation completely. It’s also worth noting that while a Space can be minimised to listen to while scrolling, the captions are no longer visible when you do so. This was an issue for me when I had to close the Space to check my mentions for feedback (because, as I go on to explain later, the reactions available in the feature only convey so much). Someone pointed this out and suggested the captions be made visible at all times, and I would agree.
This taps into a wider issue about what is contained within the Space itself. Audience members could interact with the host by tapping on emoji reactions (raised hand, 100%, closed fist, waving hand and a peace sign, if I recall correctly), but these only convey so much. I was having to use these to check things with my listeners, but I still struggled to understand them at times. In a conversation on Twitter on Wednesday, I suggested that the platform perhaps consider adopting a set of emojis similar to Facebook Messenger’s current selection: thumbs up, thumbs down, laughter, sadness, shock and anger.
On a related point, it was only after going live for the first time that I realised the raised hand reaction might be used to signify someone wanting to speak (certain settings available to the host can limit who can automatically turn on their microphone). The issue with this, however, is that the reactions are time-limited, meaning a host may not be able to remember who has expressed an interest in speaking. A better solution, as I suggested on Twitter after my livestream, would be to follow Zoom’s lead, and make it so that a raised hand stays raised until the participant has had the chance to speak, or lowers it themselves.
Another technical point worth mentioning is just how refreshing it was to see a button from the very start asking whether I wanted to share transcripts with my audience. Putting such an ask front and centre will no doubt increase awareness of accessibility on the platform, in a way similar to what we’ve been seeing with alt text/image descriptions on Instagram and Twitter at present.
Finally, it was striking to see just how much Twitter Spaces resembled in-person conversations I had had with people before the world was in its current state. More specifically, the importance of context came up in my discussions with other Twitter users about the feature.
I think about my discussions with people where my eyes dart around the room, trying to find a speaker and then trying to ascertain what it is exactly they’re talking about or debating. Sometimes I consider guessing, if I am confident that my guess won’t be mocked if it turns out to be wrong. This feeling can come with latecomers to a Space, who have to quickly rush to figure out what is being discussed, or who see a tweet of mine with an expired link. The delay in captions certainly doesn’t help with this feeling users can experience.
As more people joined my livestream, I had to recite the usual line about house rules to everyone (no expletives or nasty comments if you’re allowed to speak during the stream, essentially). Those who had stayed for the majority of the Space’s lifespan – some of them Twitter employees – were no doubt fed up with my voice come the end of it.
There is a need to ensure that everyone has access to the same context, regardless of what time they join the Space. How, exactly, I don’t know, but it’s probably something which will be answered as I continue to play with the feature in the coming weeks and months.
For what it’s worth, the feature is incredibly promising. I can certainly see it being used by artists to give secret listening parties for their new single or album, or by those considering exploring another approach to podcasting.
There are, indeed, pros and cons with its current iteration, and it’s no doubt going to get better and better as the team at Twitter continue to work on it. As I said in a tweet after my first trial run, it’s refreshing to see a social media platform collaborate with deaf and hard of hearing people on new features as early as possible. By doing this, accessibility is more likely to be in place at full launch, rather than added as an afterthought.
I am grateful to Twitter for the opportunity to be a part of this process, which I should stress is 100% voluntary, and is not gifted, paid or sponsored in any way.
On the day this piece is published (Saturday 16 January), I will be going live once more on Twitter Spaces to further test out the accessibility of the feature, and have a more structured chat about disability, politics and more.
Follow me on Twitter @LiamODellUK for updates, and iPhone users can join me at 5pm (GMT) to take part in my next live chat. See you then.