I am deaf too and I echo the sentiments of the article. I've found Microsoft Teams vital for me as it has auto captions enabled. Zoom doesn't have this, so we have shifted away from using Zoom completely.
Back in pre-COVID times, customer calls were done in a room over a conference speakerphone. I would be totally lost and would need to rely on my colleagues to help.
With Microsoft Teams, I can run customer calls myself, relying on the captions (and the patience of the customer).
Massive thanks to Microsoft for spearheading accessibility. I hope you guys see this.
The first time I was invited into a Google Meet conference, the captions were on by default. I was impressed at how well it was doing in real time. I've seen closed captions with hired human operators during live events that didn't do as well as what I was seeing. Watching the captions disappear as the backspace was used to make a correction has always made live captions so human to me.
It's anecdotal, but Google probably has a ton of speech-to-text data from YouTube captions and such. Their Hangouts captioning was phenomenal. I have full hearing but I still switch it on. It's really good.
Don't forget that their CAPTCHAs for the visually impaired use YouTube audio, so they're building a very accurate, human-trained closed-captioning library.
One thing that might help you is the Google Recorder app. It has live speech to text and you can prop your phone next to your laptop running Zoom. It's not ideal but can help make the meeting a bit easier to follow.
Can confirm... a phone running Google's Live Transcribe propped up next to my laptop is even better for me than Google Meet's captions because you can see the history. Will have to try Google Recorder.
I use hearing aids and there is still a bug in Microsoft Teams when you connect via Bluetooth: the ringtone is super loud despite adjusting the volume. Other people complained months back and it still is not resolved. Fortunately my hearing is good enough that I can put headphones on top of my hearing aids to avoid this problem.
I have this problem with Bose SoundSport headphones and FaceTime Audio. The ring and answer tones are often painfully loud. Over three years and several firmware updates, it's still not fixed.
I think Bose has become unable to focus on user experience. This is shown by: 1) their decision to remove physical buttons, 2) ignoring counterfeit products sold on Amazon, and 3) failing to make their $350 Bluetooth headphones pause music when you take them off.
I work with a very international team and multiple people have accents too thick to understand. I've found that Teams and Google Meet understand them better than I do. The captioning is a godsend.
Don't feel bad! I have a thick accent when speaking English and have had colleagues ask me to try to enunciate better since they cannot understand me.
I know it's my job to communicate clearly so I make the effort to try to speak in a way that can be better understood by my team. I know my team are making an effort to understand me too.
I appreciate being told that I cannot be understood because it shows my team want to understand me and want me to be understood.
If your team can get past the sometimes awkward moment where you have to say "sorry; I didn't understand that", I'm sure they'll appreciate it too.
Hi GEBBL, I'm the Principal Lead Program Manager behind live captions in Microsoft Teams. My team and I envisioned live captions for meetings and delivered them in Microsoft Teams worldwide in Dec '19. We built it to improve inclusivity because we saw a rise in remote meetings (we couldn't have predicted COVID). We are super glad and grateful to hear that live captions have been extremely helpful and useful to you. Thank you for choosing us. Please don't hesitate to share any further feedback or comments.
Regards,
Shalendra
Hmm, about a year ago when I was on Zoom meetings, we could get a generated transcript of the whole meeting, along with a recording of the meeting.
Was this feature removed at some point? One of my coworkers would export the meeting conversation and save it in Slack. While it wasn't always perfect with the captioning, it worked well enough!
Ah! Thanks for the distinction! I couldn't remember if the transcription was happening in real time during the call, or if we just exported it afterwards. I hope the Zoom team adds real time soon!
That's fantastic and eye-opening. I imagine the next leap will be language translation, and with that, the prospect of having support calls with anybody, irrespective of language barriers, is getting within reach.
We're getting really close already. I often watch non-English-language videos on YouTube and the to-English translations are good enough for most things.
I think for technical conversations, too much nuance is lost, though.
> Is this some kind of cynical belittling sarcasm?
It looks like a legitimate question to me. One you didn't actually answer.
Are you concerned about privacy when using video chat? If so, which part? Voice, video, transcript, etc.?
I'm also interested. I assume - from a personal point of view - that the data is being stored and could be leaked, but I'm happy to accept that risk in exchange for the benefit the service provides. I've not really considered the corporate side of things though.
I worked for a company where our accessibility engineer was blind. He once took us through his world: speakers set up so we could hear his screen reader and get a feel for how he navigates websites.
I remember him starting the presentation saying, "Can everyone see this? Good because I can't."
I'm a (hearing) son in an all-deaf family, and am a bit tired of waiting for every single service to become accessible.
Why not just be platform-agnostic? It's the same problem that has been going on for non-captioned videos for years.
If a captioning service is going to be exclusive to only Google Meet or Teams, then by definition, it's not inclusive.
So when my parents and I get on Zoom, they use Ava (http://www.ava.me) to caption what I say with their speakers on.
It looks like a caption bar, floating on top of the screen, so when I share my screen they can still see captions.
My dad started using Ava at work, where they use GoToMeeting. They just won't move to Zoom easily, so he gets access without having to shift the whole company onto it.
The software also neatly separates who says what if you share a link in the chat and people connect to it. It also voices out what's typed on the other end.
I'm going to take this as an excuse to ask some questions about how the hearing impaired use video chat:
How much does the video component help? I'd assume that most modern video chat solutions are adequate for sign language (but you know what they say about assumptions!). I've heard that reading lips over video is difficult. Is this a case where it's possible to read maybe half of what people say, or is it just impossible to read anything?
What happens when a deaf person receives an unexpected call/message when they're not looking at their PC or handheld? Is vibration from a mobile device usually enough? Are there any visual systems for a desktop to signal the user at a distance? (I'm thinking of something like the flashing light alarm clocks.)
Is communication via sign language over video chat feasible with a mobile device? I honestly don't know how much can be communicated with only one hand free. If one hand isn't enough, is propping the device up on a stable surface enough to be able to communicate this way? (Common sense says that at this point everyone would just be texting, but sometimes you just need that human connection.)
I don't have any practical need for any of these answers. It's just something I've been curious about for a while.
I ignore all calls to my phone; my wife has set my voicemail up to say I'm deaf and to text me instead.
I've got a rather expensive system connected to every doorbell and smoke alarm in the house that also functions as my alarm clock; it vibrates the mattress (just a small circular vibration device placed under the bed).
Lip-reading over video is good enough IF you can hear enough, so because I have a cochlear implant I can do both, and that's usually good enough. Without the processor on, I'm profoundly deaf, so video would be quite useless (hence the vibrating smoke alarm thingy).
I keep my phone on vibrate-only.
At the end of the day I'm a little different because I rely on my cochlear implant and for some people in some communities (see deaf vs Deaf), there's quite a lot of stigma.
We say deaf & hard-of-hearing please, not hearing-impaired.
Lip reading is hard, period. Over video it's harder when the network connection isn't great and more people are speaking, so yes, mostly difficult.
Flashes / vibration are common alerts, yes. I prefer flash, since you might not be touching the surface that vibrates.
And yes, one-handed sign over video is very frequent, but maybe best for 1-1 or familiar groups, because you do lose a bit in precision/fluidity. If you want to be accessible over video, find a stable place and sign in a dark-colored shirt so the contrast is better ;)
Not to belittle the whole PC thing -- I get it: repeat to someone that they are at a disadvantage their whole life and they'll grow up believing it. My girl understands that her hearing makes her different but doesn't believe that it has hindered her life in any way. Having four kids pretty close in age, it's hard not to compare. My hard-of-hearing daughter is the one out of the four that will surprise us every few months with some new thing she decided to learn how to do from YouTube[0].
She's the most self-directed learner of all of my kids. I think she knows the right terminology. I know she doesn't care at all, though, and we've raised all of our kids to assume that people aren't out to offend you.
[0] She sings incredibly well -- something we didn't expect -- so a few years back she wanted to sing for the talent show but she wanted to do so with a puppet. We thought the puppet idea was a distraction and like the good parents we were, we let her demo it for us and didn't pay a whole lot of attention and tried to discourage her from using the puppet. She was super-discouraged and, thankfully, we asked her to do it again. It was at that point that we realized she was singing beautifully and ... not moving her lips at all. I think she was 8.
> We say deaf & hard-of-hearing please, not hearing-impaired.
Whoops, my apologies.
> Flashes / vibration are common alerts yes.
A quick search reveals that there's a mishmash of different ways to use a phone's flash for notifications. Are there any equivalent options for desktop/laptop systems?
In any case, thanks for taking the time to engage with me.
> questions about how the hearing impaired use video chat
I am not deaf or hard of hearing, but have a teenage daughter who is old enough to articulate some of these things. She doesn't sign (used to, and there's reasons beyond "didn't use it" that she doesn't) and what she's told us is:
(1) Video calls are better than audio-only, but just barely. If you watch her with a group of friends, she's not always looking at the person who she's directly communicating with. She uses her vision to keep up with fast moving, multi-party conversation and to fill in missing sounds. She hears sounds well into mid-range tones. Deeper and high-pitched tones are not heard at all, but she can feel the deep tones[0]. Through lousy laptop/phone speakers, it's all mid-to-high tones and -- I get the impression that people sound much different to her over speakers than they do in person, but I have no way to confirm that.
(2) Without closed captioning, she misses a lot. She doesn't read the captioning word-for-word but uses it like lip-reading -- to fill in gaps for sounds she can't hear. This happens to her more frequently than to me (I have very mild hearing loss; a little worse than normal for my age) -- sounds might be heard, but a competing sound ruins the "signal" (signal/noise ratio) more for her than it does for me. Worse, because video conferencing software likes to focus on the speaker, and the latency involved in switching to the speaker is very long, it can make the calls even more confusing.
The second point can't be stressed enough. My daughter is not completely deaf. She doesn't "lip read" in the way you see on TV[1] -- she can hear enough that she does it unconsciously. My daughter (and Mom and Dad) had no idea how much she relied on lip reading and closed captioning (it's on by default on all of our TVs so we don't even notice it). When Zoom meetings and mandatory mask wearing came along, it took very little time for us to notice she was struggling way more than the other three kids. It took about a month before we figured out why.
In her schooling situation right now (online-only public school -- one not set up specifically for COVID), video/audio conferences are somewhat minimal. Teams works well enough for her, and she's adjusted by pre-reading the materials -- basically learning whatever it is that's about to be taught to the other kids -- because it's easier than trying to learn it from the live lesson. If she didn't have to attend these sessions, we wouldn't make her. Thankfully, video/audio/live lessons are very minimal in the program she and my son are enrolled in.
[0] Something I learned (my daughter is more accurately my step-daughter, so I met her when she was about 4 years old) -- she had a terrible fear of fireworks until she was about eleven. We found out this is a really common thing with people who are hard of hearing. After a lot of questioning, it would seem that she feels the "boom" of the explosion but the corresponding sound is very quiet or just missing entirely. Combined with the difference between the sight of the explosion and the time it takes for the sound to reach your body, it resulted in her being startled, constantly.
[1] À la Seinfeld -- you couldn't have her spy on someone's conversation without hearing their voice. While she's better at it than I am, she's still about a time zone away from the actual words that were said.
Hi Quinn! Coincidentally, I've been working on a sound visualizer that accurately translates sound to visuals, preserving all the information and, what's far more important, transferring the "sound symmetry" we often refer to as "musical harmony". The visuals can be translated back to sound.
As someone with nearly perfect hearing and musical education background, I find the visuals oddly accurate when I also listen to the corresponding music. It can trivially distinguish human vowels, as those produce different recognizable shapes.
I've put together a few sample images and a live demo: check out "github soundshader" (the project that talks about ACFs). I'm really curious what you think about this.
It would be awesome if there was a pre-populated example on the GitHub Pages site. I'm on Chrome on an iPhone right now, have no mp3s to try, and the mic button didn't do anything.
I've been meaning to do this, actually. However, mobile Chrome and Firefox don't seem to work yet, at least on Android. For example, mobile Chrome renders only a 1-pixel-wide stripe moving upwards. It correctly changes color, so all the machinery around GLSL must be working, but I have no idea why the rest of the image is pitch black.
Hmm, I was hoping it would help with pronunciation, visualizing the difference between saying "da" and "ba" etc but it doesn't seem to be sufficiently fine-grained for that
Every mp3 and an attempt to use mic results in "AudioContext.createMediaStreamSource: Connecting AudioNodes from AudioContexts with different sample-rate is currently not supported" error message to appear in Firefox 82.0.2 on Windows 10.
Try adding `?sr=44.1` in the URL query to set custom AudioContext sample rate matching the mp3 file. Usually it's 44.1 kHz, so it's the default value. On desktop Ubuntu, both Chrome and Firefox can seamlessly rescale the sample rates, but I haven't tested the app anywhere else.
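For context, this is roughly what that parameter does under the hood -- a minimal sketch, assuming the app reads the query string this way (the variable names are illustrative, not the actual soundshader code):

    // Create the AudioContext at a sample rate matching the mp3, so Firefox
    // doesn't refuse to connect nodes across contexts with mismatched rates.
    const params = new URLSearchParams(location.search);
    const khz = parseFloat(params.get('sr') || '44.1'); // default: 44.1 kHz
    const audioCtx = new AudioContext({ sampleRate: khz * 1000 });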
I've noticed that since everyone has gone remote, everyone wants to Zoom more. Things that could be a short discussion in Slack or a quick comment on a github PR are now a full-fledged Zoom meeting. I end up taking notes, which makes the meeting even more slow and awkward. If I were deaf, it would be impossible to do my job.
The impromptu 1:1 Zoom meeting is tech’s rediscovery of the telephone call. Those were a staple of office work for good reason. The alternative is typically a scheduled meeting, which feels much heavier. Phone calls also end when they’re done as opposed to filling an arbitrary block of time.
It's faster, but it's also (from my experience) a bit more taxing and often a bit draining. Unless it needs to be resolved quickly or it's a particularly complex issue, I much prefer the relaxed back-and-forth of a slack chat.
So don't become Lumbergh from Office Space, and read the room/audience.
Use at your own discretion.
Same goes for using /zoom in Slack: if a back-and-forth is ensuing, it may need lower latency than Slack can provide, so a call that can upgrade to video/screenshare is nice.
Oh, I'm the last one to start a call like that, because I know how bad it is to get stopped in your tracks and diverted when you're on a roll. Not that calls and conversations can't be useful, I just... don't like it when people think that's the only way to communicate.
I'd love to see the data, but I suspect the number of meetings and the hours lost to them increased when we switched to remote. Not that we needed to coordinate more, but the company was no longer constrained by the number of meeting rooms available. Now you could schedule any number of meetings at any time.
Before Covid, you had to look in the schedule to find an available room. If there was none, you had to postpone the meeting or try to do it informally.
If I were developing Teams or similar software, I'd allow companies to institute virtual meeting rooms that would function like the real deal. You can do impromptu 1:1 calls, but anything more than that requires a virtual room with a schedule. Another thing that could be useful is productivity metrics of hours spent in meetings (it's possible it exists).
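To make the scarcity idea concrete, here's a hypothetical sketch (every name below is invented; this is not an API any existing product exposes):

    // Calls with more than two participants must claim one of a finite
    // set of virtual rooms; if none is free, the meeting gets postponed.
    interface Booking { room: string; start: Date; end: Date; }

    function findFreeRoom(bookings: Booking[], rooms: string[], start: Date, end: Date): string | null {
      for (const room of rooms) {
        const clash = bookings.some(b => b.room === room && b.start < end && start < b.end);
        if (!clash) return room; // first room with no overlapping booking
      }
      return null; // scarcity by design: postpone, or keep it a 1:1 call
    }

The point of the artificial limit is exactly the pre-COVID dynamic described above: if no room is free, the meeting has to be postponed or handled informally.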
"Another thing that could be useful is productivity metrics of hours spent in meetings"
In my work I am required to log my day-to-day activity (for external client billing), including all of my time in meetings. This time breakdown goes into my monthly report. I do not call any meetings. It is always the PM or another manager setting up meetings and wanting me to be in it. The PM, and the CTO, and other higher ups get very upset, at me, when they see that I log between 40 and 90 hours per month in meetings. I am to blame for this apparently. And they also wonder why development is slow. But this is also a case of more managers than there are developers. It is also why I am looking to move on at the earliest opportunity.
Exactly. I feel that if we compared metrics from before COVID and after, we'd see a sharp increase in the number of meeting hours, in part due to the unrestricted number of potential meetings. Having hard data could allow companies to make some changes and improvements.
This is just due to lack of feedback. If I get sent a Zoom invite and there isn't a clear reason I know of to accept, I always ask in chat if a meeting is necessary.
I find more than 75% of video meetings are worthless, and this system has worked well.
It’s been a practice of mine for years and several coworkers are only learning about it for the first time:
I either decline meetings with no agenda or nothing in the body, or leave them tentative and reply back with nothing but a question mark to the sender. Mostly decline.
It conveys more information than the request, while demanding less.
"Meeting, tomorrow, 11am" - "I need your time but can't be arsed to tell you why"
"?" - "What's this about, then?"
Personally, yes, I would respond with a full sentence. But it would mean the same thing, and doesn't do as good a job of discouraging the rude behavior.
> Personally, yes, I would respond with a full sentence. But it would mean the same thing
It has the same semantics, but the tone is entirely different.
> doesn't do as good a job of discouraging the rude behavior.
Passive aggressiveness doesn’t discourage people from the behaviour, it just makes them think you’re an unpleasant person. If you want somebody to change their behaviour, tell them what the problem is and ask them to change it. “?” doesn’t do that.
At the same time, requesting a meeting with 0 information about the purpose is pretty rude. That kind of request is a pretty good indicator it's not going to be a productive meeting if the topic can't be concisely written.
I do this but would never reply with just a question mark (I'd politely decline or ask for additional context in order for me to attend). If I have no context for a meeting invite and it's lacking an agenda, I don't think it should fall to me to figure out what it is -- that's the onus of the organizer, in my opinion.
I am open with those I work with about this policy and I have never had anyone take issue with it; in fact I've gotten the opposite -- positive feedback and others adopting similar policies. I think a key part of this is openness and positive, direct communication, though.
I try to avoid unnecessary Zoom calls all the time, but man I miss seeing people and asking them how they are. Lots of communication is nonverbal and video fills that gap on an okay level.
WFH doesn't work well for everyone. I noticed that it impacted my work significantly, and coming up with ways to mitigate this is neither simple nor trivial.
> The team immediately spotted the tradeoffs: while Meet’s captions were built in, there’s no history, meaning each person had to dedicate their attention to the captions to make sure they didn’t miss anything.
I actually wrote a js bookmarklet to solve this problem by continuously capturing the caption updates and writing them into a separate window. It's a little janky, but works (at least until Meet releases the next update that changes the obfuscated class names in the DOM). https://zlkj.in/bookmarklets#record-meet-captions
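The core of it is just a MutationObserver. A stripped-down sketch of the idea -- the real selector is an obfuscated Meet class name, so treat `.captions` below as a placeholder:

    // Mirror the caption node's text into a popup window, building up the
    // history that Meet itself throws away. Naively re-appends the text on
    // every mutation, hence the jank.
    const log = window.open('', 'meet-captions');
    const captionNode = document.querySelector('.captions'); // placeholder selector
    new MutationObserver(() => {
      log.document.body.append(captionNode.textContent ?? '');
      log.document.body.append(document.createElement('br'));
    }).observe(captionNode, { childList: true, subtree: true, characterData: true });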
The captions feature of Google Hangouts is great if you can't hear well, although sometimes it is akin to watching a "bad lip reading" video on YouTube, which I'm sure nobody else notices because I'm the only one with them turned on.
The TDD service hasn't changed since the '90s (1890? 1990? debatable), so I avoid it. It might have been amazing in the '60s, but it is a punishment to use today compared to chat, messaging, e-mail, almost everything else.
Most banks don't seem to get this, and precious few offer secure messaging as an option - First National Bank of Omaha is the best I've found. Citibank has excellent support via chat, but all of their financial products are absolutely the worst and their fee structure should be considered usury. Capital One has amazing financial products but no provisions for the hard of hearing - if you send them a letter they will send one right back saying "Call our 800 number" unless you explicitly write that you are writing because you can't use the phone. Thanks, Capital One, really helpful.
The absolute worst is Amalgamated Bank: as soon as they found out I couldn't hear, they closed my account and sent me my funds via cashier's check -- two months later.
> The absolute worst is Amalgamated Bank: as soon as they found out I couldn't hear, they closed my account and sent me my funds via cashier's check -- two months later.
That's terrible. They wouldn't get away with that here in the UK. Discrimination against those with disabilities is illegal. If they tried that the banking ombudsman and the courts would come down on them like a ton of bricks.
I'm not sure which ones are available in the USA but I've used a few different "New Banks" here in Europe and have had fantastic text-only conversations with banking staff.
Monzo in the UK, and N26 elsewhere. (Looks like N26 is available in the US now too).
> For me, it was fascinating to see how our interactions changed over the course of the call. Everyone became more expressive in physical space.
I found this to be extremely useful, even when talking to people with normal hearing on a video call. Because the call omits so much context that is obvious when talking in person, it helps to overact emotions, especially if you don't want to take valuable speaking bandwidth. (Or worse, talk over someone, which happens much less elegantly over video call.)
I took to overacting pretty immediately in the beginning of lockdowns, but noticed that I was one of the few doing it. And I am still not sure how to broach that conversation with people. "Hi, I notice that you're acting normally on this video call. It would be best for you to comically overact your emotions so that people can get a read on how you are reacting."
Some examples include: nodding (moving almost from chin touching chest to staring at the ceiling); smiling (mouth wide enough to almost hurt, with a bit of space between the top and bottom rows of teeth); thumbs up (holding it right next to my head for 3+ seconds); and reconsidering something (hand over chin, almost covering mouth, with my entire head oriented up and away from the camera).
I've gotten good feedback on it, so I don't think it's just me being weird. But it does sound odd now that I've written the examples out.
There's another really difficult challenge for hearing-impaired people not mentioned here: it's mentally exhausting to follow online meetings, even with captions. Hearing-impaired people have to concentrate very intently while also trying to make sense of lagging/imperfect captions, and that's a high cognitive load to sustain for long periods.
Accented English is particularly challenging, because Otter.ai (used in Zoom) has very poor accuracy with the most common accents we encounter in software engineering.
Absolutely. This was actually something my team brought up: it was much more exhausting than usual, because it places a much heavier mental load on everyone. In my case at least, it's still better than lip reading in large in-person meetings.
We have tried both Meet and Zoom for the ASL class I am taking. Both are fine until the instructor needs to present a powerpoint. Zoom stops showing everyone's video. Not sure if there is a difference between Zoom desktop app and Zoom chrome extension.
For Meet, it is nice that the instructor can present from a tablet, but there is no way to hide the presentation video, so the video grid is way too small to see everyone signing.
I am not a huge fan of a lot of Microsoft products, but I have been very pleasantly surprised by the Teams desktop app for video calls. With a decent sized screen, it seems to work pretty well at keeping everyone "present" on the screen, though I haven't experienced using it with people who only speak visually.
With the Zoom desktop app you can click a button on the top right to switch the view. This allows you to see the presentation and the gallery view at the same time (make sure to have a recent app version). With the mouse you can resize the view between presentation and videos if you like to make the videos bigger.
If you have two screens for Zoom you can keep one screen for screen share and see everyone else on the other. I think there might be a way to turn on this dual window approach on a single screen too so you can control how much screen real estate each gets.
Zoom can show face video if you don't maximize the screen - the participants show up on top. Or use a 2nd monitor. Zoom was the first one to capitalize on 3 monitors - I've used a bidirectional share + participants on the 3rd screen.
I don't think it shows everyone at once though? Which is pretty important in an ASL class. I only have the chrome extension which doesn't support popping videos out to separate monitors.
It's a feature of Zoom, meetings can be set up to force presenter control. I'm not sure if there's a way to allow individuals to bypass it, but it is something the meeting-creator can turn off.
I will often watch movies with them on. It might be because I grew up with them, as my father is deaf, but I find they give me more information than speech alone.
Me too. Captions and subtitles allow me to watch videos while listening to music, watch videos in public settings where I either didn't bring headphones or don't want to wear them, and watch two videos at the same time when both are filled with 80% fluff and 20% content, as they usually are.
I've been in hospital recently after suffering a large oil burn to my left arm, and part of my chest. As most of us probably know, wards get very noisy. I only ever used my cochlear implant processor when I'd see the team of doctors first thing in the morning, otherwise I'd just chill and watch my laptop with captions on.
Absolutely shocked me every time I'd turn my processor on and the world would roar into my ears. "How the fuck is anyone sleeping" I'd ask myself.
I am Deaf and I live in a country with diglossia. This means transcripts fail if customers or colleagues switch to the other language (Swiss German, a bit like Dutch in being different from German). Additionally, there are sometimes customers talking in French or English. This means video conferences usually don't work for me. I either correspond by email only or have a sign language interpreter or a human transcriber.
But nowadays I am quasi-retired anyway because of issues related to being limited in other ways, and I am happy that way: for a few years now I have avoided some of the bullshit that comes with working for and with the corporate world. My quality of life has risen.
I still program just for fun, and I found another niche: counseling Deaf and HoH people on life and legal questions. The little money is welcome.
You are welcome to ask specific questions here about avoiding issues for Deaf and HoH people.
This is a great article and being deaf myself, I have very similar experiences. Some thoughts...
I'm a heavy Otter user and find it well worth the subscription - I used it a lot even in person, and have used it more since we all started working from home. We use Teams for most of our meetings and I always switch the captions on there; one or the other sometimes lags, but usually not both at once. For family video chats, we use Google Meet, and it's usually on par with Otter and Teams.
I have read a lot of positive comments about Google's Live Transcribe and Recorder apps, but can't evaluate them since I'm an iPhone user. Between this and the other hearing a11y features in recent Pixel phones, and me having one foot out of the Apple ecosystem anyway, I give serious thought to switching to a Pixel almost every year, and one of these days I'll actually do it.
Worth mentioning: Google added the Pixel's Live Captions feature to Chrome and it's in the beta and main channels, but it's hidden behind a flag AND a setting, so you have to enable the flag, restart Chrome, and then confirm in settings that captions are turned on. I believe this is entirely on-device, and it works astonishingly well; I've used it on programming streams on Twitch and it handled the technical vocabulary better than I expected, if not flawlessly.
A US-centric point I should mention: for companies above a certain size, the Americans with Disabilities Act requires that they make reasonable accommodations for you. I have used this in the past to get a professional captioner. They're more accurate, but inconvenient to schedule, usually have a bit more lag, and are very expensive ($125 an hour was about average when I last had to get quotes). Not everyone wants to make the accuracy tradeoff and that's fine - I understand their position - though I'm personally willing to because it makes it very easy to jump on a call with a coworker whenever I need to.
Quick notes on my setup: I have a 3.5mm splitter hanging off my work laptop's headphone jack. One side goes to a ReSound Multi-Mic (which includes a 3.5mm jack) to stream the audio to my hearing aid. The other side has a TRS-to-TRRS cable going into my iPhone (via the audio to lightning adapter) or iPad to provide an audio stream for Otter to work on. USB webcam provides the microphone. I typically use my iPad since I can set it up under the monitor my video call is running on and then I don't have to use screen space on my monitors to display the captions.
I really should write up a blog post about all of this, with photos.
FYI, Zoom is getting built-in automated live captions very soon. It will be produced by Otter.ai, which already produces the recording transcriptions. With 3rd party live captions, there’s an option to see all the text in a sidebar so I assume it will also be possible with their automated live captions.
For anybody interested, the following equipment works best:
1) Oticon hearing aids + TV streaming device connected to the iMac via an audio interface.
It streams the audio from the computer to the hearing aids.
2) Teams + captioning. The person who needs to see the captioning has to initiate the meeting and share the link. The link never expires and can be saved in Notes by all members of the group. The members of the group need to share the meeting time via other means, for example email or text message.
Compared to Zoom and Google Meet, Teams has superior video, audio, and captioning capabilities.
There are desktop and iOS versions of Teams. There is no way to use Otter reliably with a hearing aid device because it requires the Chrome browser and needs physical sound from the speakers to work; routing the sound internally does not work.
3) Phone calls - CaptionCall. It is a free service paid for by the federal government. You sign up and receive a dedicated phone number, which you enter in the settings for call forwarding. When you are calling or somebody is calling you, CaptionCall intercepts the call and forwards it to machine transcription plus a human transcriber. This is the best combination.
4) Messaging across devices - WhatsApp. There is no video or audio captioning on iOS, so we use only messaging. Used for announcing Teams meetings.
None of the other solutions -- WebEx, BlueJeans, etc. -- do anything to accommodate.
Microsoft has the best accommodation service: there is a dedicated web site with free 24/7 support for people needing accommodations. One Teams account is just $5/month and support responds within several hours.
I love how many different approaches there are for setting things up!
One quick note about Otter: I'm using Loopback to create virtual audio devices that let me feed the audio as an input source as well as play it through an output channel. That might work with the hearing aids.
(I use somewhat outdated Phonak Naida V SP’s – looking forward to the next gen that have better Bluetooth support.)
With Otter it is complicated. On the iMac the sound needs to be split, but Safari does not allow it. So they recommend Chrome, changing its settings, and then setting up an aggregate device. But that does not work. The only way it works is if the sound is also coming out of the speakers and the microphone picks it up. Too complicated: there is feedback and no streaming to the hearing aids. So Zoom + Otter is out of the picture. Teams has the captioning and the best quality. What happens is a lot of companies insist on using only Zoom. It is a mystery why they just don't use the link from the Teams invite; if they don't have Teams installed, it works from the browser too.
I'm on the flip-side of this. I have enough hearing that it is only sometimes a barrier. But I have an associated speech impediment. Voice recognition is always garbled.
I was expecting the article to be about sign language over video and whether it's a good experience or not. Does anybody here have experience with that and would they recommend it over captions?
Also, for someone who picked up sign language later in life, how difficult has it been to learn in your experience? Would it be a viable option to have a whole company take sign language courses in order to help become more disability friendly or would that be too much of a time investment with little reward?
This is how I handle Zoom meetings as someone partially deaf. It's fairly accurate, and you can use a second mobile too. The downside is that it will pick up everyone, including your voice.
I do the exact same thing as well. Since I am a federal employee, I tried https://www.sprintrelay.com/federal at first when we had dial-in conferences but it was hard at first - I requested that people start their sentences with their name and to whom they were talking like so: "This is Alice here, Bob, did you..".
We moved to Zoomgov and it was hard at times to get captioning scheduled so I moved back to using my phone (which is what I used pre-covid at our in-person meetings) with Live Transcribe. We have the host with OneNote open and everyone has submitted notes so we know the topics that are going to be covered and can use it as sort of meeting minutes for anyone else that missed the meeting or to review what topics were covered.
My hearing is mostly fine, but I've looked for decent options to use for dictation, and Live Transcribe is the closest I've found to one with enough precision and speed to be tolerable to use as a replacement for typing...
I wonder if there's a way to build a video codec that does automatic lip detection and sends the lip data in much higher resolution than the rest of the stream?
I'm glad to see someone bringing these points up. I have been dealing with mild hearing loss for a little while now and have found Teams closed captioning to be extremely helpful, but my hearing is good enough that I usually manage fine on my own.
My daughter is hard of hearing, requiring hearing aids for help. She also relies a lot on lip reading to disambiguate sounds that she has difficulty making out. We learned this year that her reliance on lip reading and visual cues was much more than we ever realized. In large groups of people, she relies on lip reading to help determine who the speaker is.
Then our school announced that it was doing "Zoom Meeting School"[0] and that, if classes resumed in person, mask wearing would be mandatory. It's tricky enough getting her accommodations at school -- and even with everything they offer, we knew that not being able to rely on lip-reading was going to put her at a huge disadvantage. We immediately enrolled our children in a local, public, online-only middle/high school. They have a few live lessons a week, on Teams. The rest is reading, taking tests, writing papers and video-learning.
Out of the two children who were moved to online school[1], our daughter had the easiest time adjusting, but all of them are doing better than they were in the traditional program. We'd noticed a few years ago that she was the kid who would hop on YouTube and decide to randomly learn how to do something. I partly wonder if her having a harder time understanding teachers has caused her to discover that she's a more efficient learner when she controls how it's done.
We were initially alarmed when the first month of classes went by and the kids were finishing a day of school in three to four hours, tops (Friday was 30 minutes!), but we were the only family in our neighborhood who left the district and by comparison, they're about a lesson ahead in everything and in addition to getting the best grades they've ever gotten, they actually know the material. It probably helps that my high-school son, who's getting into some pretty intense classes, has the benefit of 9 hours of sleep at night, and can start learning with an alert, wide-awake mind.
Incredibly, both of my kids are still on the fence about going back to in-person. I think a few more months will change their minds. Their choices: (a) Wake up at 05:30 so you can look appropriate enough to avoid getting picked on, stand out in the cold waiting for a bus, spend 7 hours in an old, cinder-block building that has more design elements in common with a prison than a place designed to foster creativity and learning vs. (b) Wake up at ... whenever ... finish up 3-4 hours of work while having every convenience of home that isn't allowed at school and have the freedom to experience life for the hours your friends are covertly sleeping through Math class.
[0] Our district has decided that the best approach to educating children is to simulate the classroom experience via a video call that the students must attend all day (with a two hour break).
[1] I have an odd parenting situation; two of my kids are home-schooled (parent-directed without professional educators) and two are (now) in online public school through a local school district.
"we need to make work more inclusive for people living with disabilites" - statements like this are pretty popular, and they sound nice to the first order. However, I have never once seen a statement like this accompanied by a utilitarian/economic analysis of whether this would actually be socially beneficial given the (usually) small number of people involved compared to the (often) large cost of making the change. And whenever someone says "would this really be worth it", responses vary from "Wow, why do you hate the disabled?" to "You should be glad you're not disabled" - in any case, shaming the person asking this question rather than addressing the (reasonable) question.
About 2 in 1000 people in the US are deaf, and most of those are deaf due to old age, so they are not working anymore. I think it's very unlikely that the marginal optimum is to spend more money optimizing for this subset of users.
I would assume that there usually isn't an immediate economic benefit of doing so. Fortunately, we live in a society that is much more inclusive than it used to be. As Tim Cook said, "When we work on making our devices accessible by the blind, I don't consider the bloody ROI".
Aside from the monetary benefits (or lack thereof), if one product focuses on accessibility and another doesn't, the small subset of users who need those features may encourage or force a larger number of users to use the more accessible product. So long-term, I think there is a benefit, even aside from helping make society more inclusive.
This sort of commentary betrays a stiflingly first-order thought process.
Sure, being "more inclusive" is a benefit - this is included in any utilitarian or economic analysis of the change. However, instead of spending X marginal dollars being more inclusive to deaf people, you could also spend that money on any number of other outcomes (pick your pleasant-sounding cause: cancer research, homeless outreach, just making everyday people's lives a bit better).
If you don't understand how "monetary benefits" tie in to scalable social optimization, you should probably not offer opinions on which causes people (and companies) should put money into.
Also, keep in mind that you have ~zero skin in the game here, and Zoom has a lot, so you should lend a lot of credence to what they choose to do.
One day, you might be. Just be glad you won't be left behind.
Edit: If you want a purely numbers based analysis, well, you're not considering second-order effects. Investment in accessibility represents an investment in technology. ASR in meetings, for example, is a necessary first step towards good auto-translation.
This is the exact stupid response I mentioned earlier: “be glad you’re not disabled!” How about you get a real argument instead of just shaming people?
You do realize you're saying "It's not worth including some people," including those with hearing loss, as if their inclusion is worth less than everyone else's.
Inclusive design recognizes that by simply recognizing the challenges the extreme ends of the spectrum of ability may be facing, you can design outcomes that create a better experience for everyone, not just those with permanent disabilities. It’s no harder or even necessarily more expensive to design for inclusion from the beginning—it’s a mindset that leads to better, more inclusive, and more human products.
Being charitable to centimeter, it sounds more like they're saying that their inclusion is more expensive than everyone else. And, well... it is.
That being said, the argument only works on purely utilitarian grounds. If you have any sense of deontological ethics, it kind of goes right out the window (even if you agree with the conclusion, the argument itself doesn't make much sense from a deontological viewpoint).
I do agree with this, and it's one reason I'm a huge proponent of automated speech-to-text even if accuracy isn't perfect. Having to bring in a third party to transcribe is expensive and should be the exception, reserved for when accuracy is crucial or it's serving a large audience. I'd rather we continue to push for better automated speech-to-text by default, everywhere.
I'm repeating this from below because it needs to be visible. Your statistics and conclusions are wrong. You're selling this fiction that deafness is just an "old persons" thing. You're cherry picking perhaps a statistic about people being totally deaf, but inclusion and accommodation doesn't just start when someone is fully deaf.
Among adults aged 20-69, the overall annual prevalence of hearing loss dropped slightly from 16 percent (28.0 million) in the 1999-2004 period to 14 percent (27.7 million) in the 2011–2012 period.
Hahahaha, awesome - instead of substantively addressing the argument, just go with “hmm, your comment sounds pretty racist, despite having nothing to do with race”. Killer strategy
Making things accessible has positive unintended consequences. Building curb ramps doesn’t just help people in wheelchairs, it also helps parents with strollers and tourists with luggage. Adding subtitles to videos helps both deaf people and language learners. Making websites usable with a screen reader also makes it easier to build automated tests. Disabilities are also not always permanent or complete; people with low vision can also benefit from screen readers even if not legally blind, for example.
Specifically for technology, you also need to understand the true life changing difference it can make on a disabled person’s life. My dad is blind, and thanks to screen readers on his computer and phone, he can read books, make Zoom calls for work, correspond with people via email and WhatsApp, and so much more. He’s in his 60s too, and technology has allowed him be an independent and productive member of society.
The question is how to most effectively benefit everyone. Sometimes that means not investing a large marginal amount of money into one particular kind of disability.
I'm hard of hearing and need automatic meeting captions. At a company that uses Google Meet, these are free. If the company is on Zoom and needs to subscribe to a third-party transcription service, let's say that's maybe $25/month, which is a fraction of my hourly wage as an engineer. It's possible for me to follow Zoom meetings without captions, but it presents a high enough cognitive load that 2-3 meetings in a day will exhaust my mental capacity to do engineering work for the rest of the day -- so the cost of the month's transcription service is at least an order of magnitude less than the cost to the company of forcing me to struggle through meetings.
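Back-of-the-envelope, with placeholder numbers (the wage and hours below are assumptions for illustration, not my actual figures):

    // Rough cost comparison; every number here is an assumption.
    const captionsPerMonth = 25;    // third-party transcription, $/month
    const engineerHourlyCost = 75;  // assumed fully-loaded hourly cost, $/hour
    const hoursLostPerBadDay = 4;   // rest-of-day exhaustion after 2-3 captionless meetings
    const costOfOneBadDay = engineerHourlyCost * hoursLostPerBadDay; // $300
    console.log(costOfOneBadDay / captionsPerMonth); // 12 -- one bad day dwarfs a month of captions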
I disagree that we should evaluate disability accommodations economically, but in this case the economic argument works out, too.
If you include HoH people, people who can't access audio at the moment, people who don't want to use audio at the moment (maybe they have some music going on), etc., the number of people whose lives will have been improved by implementing usability for Deaf/deaf people will probably be a significant portion of the population.
I'm not HoH/deaf/Deaf but I know I'd definitely love the opportunity to have appropriate captions during calls because I sometimes struggle to pick up words when they're spoken quickly to me, especially on highly technical or complex subject matter as required for my job.
No, you are wrong. The statement you make is true, but the implication is utterly false. You're selling this fiction that deafness is just an "old persons" thing.
Among adults aged 20-69, the overall annual prevalence of hearing loss dropped slightly from 16 percent (28.0 million) in the 1999-2004 period to 14 percent (27.7 million) in the 2011–2012 period.
For people who want comprehensive data rather than this hopelessly wrong 'factoid' underestimate, for the UK at least, check out the following 1000-page study:
It has extensive tables that break down the types of hearing loss found across genders, age, employment types, etc. It badly needs a TL;DR to communicate the key points to general readers, but 25 years after it was written, it is still cited by the UK's National Health Service as the primary reference for those who want to understand the prevalence of deafness.