microblog.at is one of many independent Mastodon servers you can use to take part in the Fediverse.
This is the private Mastodon instance of Robert Lender.

Server statistics: 1 active profile

#linguistics

2 posts · 2 participants · 0 posts today

Linguists, animal lovers, and infographic designers--this article is for you. A beautiful, scrolling animation providing a visual analysis of animal sounds across cultures. Meet: cat! duck! and pig! (if they spoke in IPA). pudding.cool/2025/03/language/ #linguistics #anthropology #design #animals #animation #cat

The Pudding · How do animals sound across languages?
Analyzing animal onomatopoeia across languages can demystify how we shape sound into meaning.
Continued thread

#ReproducibiliTea in the HumaniTeas is hosting a "Love Data Week" special workshop on Monday 10 February 2-6 pm CET on computational #reproducibility, focusing on the use of #Docker with Mark Ellison from the Institute of #Linguistics @UniKoeln.

Places on site @unibibkoeln (with tea and biscuits!) are limited so registration is crucial! We will also be live-streaming the workshop, but cannot provide one-to-one support to online attendees. Registration and Zoom link: fdm.uni-koeln.de/en/rdm-networ #LoveDataWeek #DigitalHumanities

Very happy to announce that my employers, the Austrian Academy of Sciences' Institute for Iranian Studies, are finally off X and onto a range of other platforms including the Fediverse.

If you have any interest in (or want to find out a bit more about) the cultures and languages of the Caucasus, Iran, and central Asia, then please do go and follow them @IranianStudies :)

Boosts appreciated!

As a kid trying (failing dismally) to learn French, I never got my head around the gendering of nouns. Coming from a language that doesn't have that, it just seemed rather bizarre. However, it seems to be a very common practice in languages, and I wonder if anyone can tell me if 'most' languages have gendered nouns or whether (like English) they don't?
#linguistics #askfedi

Probably about time for a new (re) #Introductions post to pin! Been bumbling around on Mastodon since November 2017 on one server or another, and happy to get cozy in a smaller server this time. :ablobwave:

I’m not a heavy poster, but lean mostly towards #SciCom and pictures of the #critters and #wildlife I spot in my central Florida backyard or from adventures on my #camping trips.

I’m currently a #caregiver, formerly a professional #Library knowledge ninja. Strong believer in the #RightToRepair and lifelong #Sewcialist. I love fixing and mending things.

My #NeuroSpicy flavour is #80HD, so it may be unsurprising that my academic studies and interests are hella broad. I spend a lot of time digging into local ecology and #HabitatRestoration and preservation. I’m a #space and aeronautics nerd, but I'm also a #Polyglot with an academic background in #linguistics, adult literacy, and ESL. The other part of my academic background is the technical side of theatre.

OK, so here's a #linguistic thought that kicks around my head from time to time.

When a #transgender individual assumes or reveals a new identity corresponding to a new gender, they frequently choose a new name corresponding to the new gender. For example, a man named Brian might become a woman named Brenda. Or Sharon. Or any number of things. But she won't be named Brian. Or José. Or Ichiro.

It seems far less common to challenge the idea that names are inherently gendered than that people are.

Continued thread

#ReproducibiliTea in the HumaniTeas is on again TODAY 16-17:30 CEST with a special session on the need to teach basic statistical literacy. Two of my M.A. students will present the results of a statistical literacy test that I developed for M.A. #linguistics students and I'll also speak about a study on researchers' statistics knowledge. There'll be lots to talk about, so do join us! 🍵 🍪

Join our mailing list to get the Zoom link: lists.uni-koeln.de/mailman/lis.

Continued thread

Very excited about our first #ReproducibiliTea in the HumaniTeas session of the winter term TODAY (Monday) 4-5:30 pm CEST at the University Library in Cologne and on Zoom! 🍵 🍪

Scott Sterling from Indiana University will be joining us to talk about research ethics in #linguistics and, more broadly, #humanities research.

Join our mailing list before 11:30 TODAY to get the Zoom link and instructions to find the room (or send me a DM if you don't see this post in time): lists.uni-koeln.de/mailman/lis

If you're a #language nerd like I am, then you won't have missed the @mozilla #CommonVoice v19 #speech #dataset release - which now features 131 languages! Here's my #dataviz, done in @observablehq, of the v19 #metadata coverage.

I've updated the visualisation this time around with human-readable language names instead of their ISO-639 or BCP-47 language codes to make it easier to read.

There are some interesting observations:

▶ Catalan (ca) continues to be the leader in terms of data - speaking volumes about the efforts to revitalise culture and language in Catalunya. It's also one of the few languages that has data for all age groups, particularly older speakers - this sort of data is missing for most other languages.

▶ Kiswahili (sw) is one of the languages where there is more data for female-identifying speakers than for male-identifying speakers ♀ - although Japanese (ja), Western Mari (mrj) and Luganda (lg) do pretty well here, too!

▶ Sentence domains can now be categorised, and although most new sentences are "general", Albanian (sq) has a lot of sentences related to law and government.

▶ Tsonga (ts), a Bantu language spoken in Southern Africa, has dethroned Icelandic (is) as the language with the highest average utterance duration. I don't know enough about Tsonga to speculate why - it's a somewhat agglutinative language, but many Tsonga words are generally short.

▶ Bengali / Bangla (bn) has a significant amount of data that is not yet validated, and therefore does not appear in training / dev / test splits. There is a similar case for many languages new to Common Voice - it takes time to validate.

▶ The language with the highest number of average contributions per speaker is Taita (dav), a Bantu language from Kenya.

What do you make of the data visualisation? Are there any other insights you can see?

Big thanks to the CV team for all their efforts - EM, Jessica Rose, Dmitrij Feller and Justin Grant.

#linguistics

observablehq.com/@kathyreid/mo

Observable · Mozilla Common Voice v19 dataset metadata coverage
This visualisation uses "@d3/stacked-horizontal-bar-chart" to visualise the Common Voice metadata coverage. The original data is taken from the Common Voice `cv-dataset` repository - direct link
Table of contents
Splits by age range - shows how many clips have been provided by speakers of different age ranges for each locale (language)
Splits by age range scaled to 100% - as above, but scaled to 100% so that the metadata coverage of low resource languages is more visible
Splits by gender - shows how many cl