Posts tagged linguistics

Soda/Pop/Coke: How Americans Talk

In 2003, then-Harvard professor Bert Vaux conducted the Harvard Dialect Survey, in which he asked tens of thousands of Americans how they talk, and released the results here.

In 2012, graduate student Joshua Katz used the data to create a beautiful set of interactive dialect maps.

And in 2013, The Atlantic called up a lot of people, asked them some of Bert Vaux’s questions, layered their answers over maps inspired by Katz’s, and made the video above.

Traveling Through Layers
Via the Canadian Journal of Communication:

When the time came a few years ago to find an Inuktitut term for the word “Internet,” Nunavut’s former Official Languages Commissioner, Eva Aariak, chose ikiaqqivik, or “traveling through layers” (Minogue, 2005, n.p.). The word comes from the concept describing what a shaman does when asked to find out about living or deceased relatives or where animals have disappeared to: travel across time and space to find answers. According to the elders, shamans used to travel all over the world: to the bottom of the ocean, to the stratosphere, and even to the moon. In fact, the 1969 moon landing did not impress Inuit elders. They simply said, “We’ve already been there!” (Minogue, 2005, n.p.). The word is also an example of how Inuit are mapping traditional concepts, values, and metaphors to make sense of contemporary realities and technologies.

It’s too perfect, no? — Michael.

A Brief History of Newspaper Lingo

In honor of the first publication of the NY (Daily) Times on September 18, 1851, The Week has some journalism lingo trivia for you. Lonely-hearts, for example, refers to a newspaper column (circa the 1930s) “in which people attempt to find friends of the opposite sex.”

Ever Wonder How … the Internet is Changing Your Typing?
Take a look at your inbox and pay attention to the number of ellipses (“…”) in your personal emails. Notice a lot of them?
Slate’s Matthew J.X. Malady did, and wanted to understand this “ellipsis overkill.” It seems, he writes, to stem from the immediacy of communication technology and its influence on written language. Now, written language mimics speech, not the other way around. From Malady’s talk with Clay Shirky:

“[M]uch of what is typed is for swift delivery and has more the character of speech, where whole, unbroken sentences are a rarity,” Shirky says. “Speech is instead characterized by continuous flow, with lots of pauses, repeats, false starts … and pauses to indicate changes in direction. We’re living in a moment a bit like Alexander the Great’s time, when he adopted the altogether remarkable habit (or so Plutarch reported) of reading silently. The relationship between the alphabet and talking was progressively broken as people learned to sound things out in their heads. Now we’re seeing a moment of reversal, where people are trying to use alphabets like we’re talking, and it’s … hard. So we reach for the ellipsis.”

See what he did there, with the … ellipses?
Other explanations posit that the ellipsis is merely a lazy man’s punctuation mark, a shortcut in simplifying complex conversation, or a tool for concise writing. Read the whole essay here, and watch your ellipsis footprint!
Related: Other tech-influenced linguistic trends, including "slash" as conjunction, gendered Tweeting behavior, and the rules of texting.
Image: Graphic from Slate

English Originated Somewhere Around Here (Perhaps)
Researchers using methods similar to those for tracking the origins of viruses such as bird flu and HIV pinpoint present-day central and southern Turkey as the birthplace of languages as diverse as English, Icelandic, Bengali and Farsi.
Via the New York Times:

[Quentin Atkinson of the University of Auckland in New Zealand] and colleagues have taken the existing vocabulary and geographical range of 103 Indo-European languages and computationally walked them back in time and place to their statistically most likely origin…
…The researchers started with a menu of vocabulary items that are known to be resistant to linguistic change, like pronouns, parts of the body and family relations, and compared them with the inferred ancestral word in proto-Indo-European. Words that have a clear line of descent from the same ancestral word are known as cognates. Thus “mother,” “mutter” (German), “mat’ ” (Russian), “madar” (Persian), “matka” (Polish) and “mater” (Latin) are all cognates derived from the proto-Indo-European word “mehter.”

“If you know how viruses are related to one another you can trace back through their ancestry and find out where they originated,” Atkinson tells the Royal Society of New Zealand. “We’ve used those methods and applied them to languages.”
Back to the Times:

Dr. Atkinson and his colleagues then scored each set of words on the vocabulary menu for the 103 languages. In languages where the word was a cognate, the researchers assigned it a score of 1; in those where the cognate had been replaced with an unrelated word, it was scored 0. Each language could thus be represented by a string of 1’s and 0’s, and the researchers could compute the most likely family tree showing the relationships among the 103 languages.
A computer was then supplied with known dates of language splits. Romanian and other Romance languages, for instance, started to diverge from Latin after A.D. 270, when Roman troops pulled back from the Roman province of Dacia. Applying those dates to a few branches in its tree, the computer was able to estimate dates for all the rest.
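To make the 1's-and-0's idea concrete, here is a minimal sketch of the encoding step only. The cognate strings below are made up for illustration and are not the study's data, and the helper names `hamming` and `closest_pair` are hypothetical. Counting mismatched positions is a naive stand-in for comparison, not the statistical tree-building the researchers actually ran.

```python
# Hypothetical toy data: each position stands for one meaning on the
# vocabulary menu. 1 = the language keeps a cognate of the ancestral word,
# 0 = the cognate was replaced by an unrelated word.
# These values are invented for illustration, not taken from the study.
COGNATE_STRINGS = {
    "English": "111100",
    "German": "111110",
    "Russian": "110011",
    "Persian": "100111",
}


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("cognate strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_pair(strings: dict[str, str]) -> tuple[str, str, int]:
    """Return the two languages whose strings differ in the fewest positions."""
    names = sorted(strings)
    best = None
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            distance = hamming(strings[first], strings[second])
            if best is None or distance < best[2]:
                best = (first, second, distance)
    if best is None:
        raise ValueError("need at least two languages to compare")
    return best


if __name__ == "__main__":
    print(closest_pair(COGNATE_STRINGS))
```

With this toy data, English and German come out closest because their strings differ in a single position. A real analysis would go well beyond pairwise counts, but the sketch shows why a string of 1's and 0's is a convenient way to hand languages to a computer.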

The findings are “consistent with the expansion of agriculture into Europe via the Balkans, reaching the edge of western Europe by 5,000 years ago,” according to the Royal Society, but counter a rival hypothesis that Indo-European languages originated much later from the steppe region north of the Caspian Sea.
Check the articles below for that debate. If the Anatolia region hypothesis holds, the languages spread more or less peacefully with agricultural migration. If the “steppe hypothesis” holds, chariot-driving pastoralists conquered Europe and Asia and spread their languages with them:
New York Times, Family Tree of Languages Has Roots in Anatolia, Biologists Say
The Economist, The Tree of Knowledge
The Royal Society of New Zealand, Mapping the Origin of Indo-European
Image: Fairy Chimneys rock formation near Göreme, Turkey, via Wikipedia.

laughingsquid:

The Endangered Languages Project, An Online Initiative to Save Languages Facing Extinction

FJP — Via the Google Blog:

The Endangered Languages Project, backed by a new coalition, the Alliance for Linguistic Diversity, gives those interested in preserving languages a place to store and access research, share advice and build collaborations. People can share their knowledge and research directly through the site and help keep the content up-to-date. A diverse group of collaborators have already begun to contribute content ranging from 18th-century manuscripts to modern teaching tools like video and audio language samples and knowledge-sharing articles. Members of the Advisory Committee have also provided guidance, helping shape the site and ensure that it addresses the interests and needs of language communities.

Google has played a role in the development and launch of this project, but the long-term goal is for true experts in the field of language preservation to take the lead. As such, in a few months we’ll officially be handing over the reins to the First Peoples’ Cultural Council (FPCC) and The Institute for Language Information and Technology (The LINGUIST List) at Eastern Michigan University. FPCC will take on the role of Advisory Committee Chair, leading outreach and strategy for the project. The LINGUIST List will become the Technical Lead. Both organizations will work in coordination with the Advisory Committee.

When Nouns Grew Genitals

Slate explores why many languages have masculine and feminine classes for nouns but English does not. In this first of a series of podcasts on the roots of language, they try to figure out what gendered nouns mean for the way we look at the world.

The Life and Death of Words

Words, like plants and animals, fight for survival, and an international group of scientists studying English, Spanish and Hebrew believes that many are, in general, dying off.

Their killer? Editors.

Via Statistical Laws Governing Fluctuations in Word Use from Word Birth to Word Death (PDF):

The modern era of publishing, which is characterized by more strict editing procedures at publishing houses, computerized word editing and automatic spell-checking technology, shows a drastic increase in the death rate of words. Using visual inspection we verify most changes to the vocabulary in the last 10–20 years are due to the extinction of misspelled words and nonsensical print errors, and to the decreased birth rate of new misspelled variations and genuinely new words.

The Guardian clarifies this a bit by killing off some difficult words of their own and getting straight to the point about how words live and how words die:

But it is not only “defective” words that die: sometimes words are driven to extinction by aggressive competitors. The word “Roentgenogram”, for example, deriving from the discoverer of the x-ray, Wilhelm Röntgen, was widely used for several decades in the 20th century, but, challenged by “x-ray” and “radiogram”, has now fallen out of use entirely. X-ray had beaten off its synonyms by 1980, speculate the academics, owing to its “efficient short word length” and since the English language is generally used for scientific publication. “Each of the words is competing to be a monopoly on who gets to be the name,” [Joel] Tenenbaum told the American Physical Society.

The phrase “the great war”, meanwhile, used for a period to describe the first world war, fell out of use around 1939 when another war of equal proportions hit the world.

Takeaway: Language is a giant Darwinian battle for linguistic supremacy. Choose yours selectively. 

Video: MIT’s Erez Lieberman Aiden and Jean-Baptiste Michel illustrate what we can learn from analyzing 500 billion words via Google Books and its related Ngram Viewer, which lets us enter words and phrases into a search engine to view their frequency over time.
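
For a feel of what "frequency over time" means, here is a minimal sketch under loud assumptions: the two-sentence corpus is invented, the function names `relative_frequency` and `frequency_over_time` are hypothetical, and none of this is the Ngram Viewer's own code or data. It computes what share of each year's words a given word accounts for.

```python
from collections import Counter

# Hypothetical mini-corpus keyed by publication year.
# The sentences are invented for illustration only.
CORPUS = {
    1950: "the roentgenogram showed a fracture and the radiogram agreed",
    1980: "the x-ray showed a fracture and the x-ray was clear",
}


def relative_frequency(word: str, text: str) -> float:
    """Share of whitespace-separated tokens in `text` that equal `word`."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return Counter(tokens)[word.lower()] / len(tokens)


def frequency_over_time(word: str, corpus: dict[int, str]) -> dict[int, float]:
    """Map each year in the corpus to the word's relative frequency that year."""
    return {
        year: relative_frequency(word, text)
        for year, text in sorted(corpus.items())
    }


if __name__ == "__main__":
    for term in ("roentgenogram", "x-ray"):
        print(term, frequency_over_time(term, CORPUS))
```

In this toy corpus "x-ray" goes from absent in 1950 to a fifth of all tokens in 1980, which is the shape of the takeover the Guardian excerpt describes, just on a scale of nineteen words instead of 500 billion.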

Would You Like Some Technology with Your Language?

A is for antivirus. B is for blogosphere. L is for LOL. And P is for pwned.

From lolcat to textspeak: How technology is shaping our language.

Speaking of lolcats, perhaps you’ve seen the LOLCat Bible Translation Project?

It kicks off, naturally enough, with Genesis, Chapter 1:

Boreded Ceiling Cat makinkgz Urf n stuffs

Oh hai. In teh beginnin Ceiling Cat maded teh skiez An da Urfs, but he did not eated dem.

Da Urfs no had shapez An haded dark face, An Ceiling Cat rode invisible bike over teh waterz.

At start, no has lyte. An Ceiling Cat sayz, i can haz lite? An lite wuz.

An Ceiling Cat sawed teh lite, to seez stuffs, An splitted teh lite from dark but taht wuz ok cuz kittehs can see in teh dark An not tripz over nethin. An Ceiling Cat sayed light Day An dark no Day. It were FURST!!

The Internets Made Us Do It

As the blame game continues over Jared Loughner’s rampage, New Republic Contributing Editor John McWhorter says fault lies with the Internet.

McWhorter, a Columbia lecturer specializing in language change and language contact, explains that we’ve moved from the introspection of analog writing to the narcissism of digital speech.

The actual cause of this new national temper is technology and its intersection with how language is used. Language exists in two forms in modern times: speech and writing. Writing is a latterly invention only some thousands of years old, produced and received more slowly than talk. It encourages reflection, extended argument (something almost impossible to convey amidst the overlapping chaos of conversation), and objectivity. Writing is, in the McLuhanesque sense, cool…

…It is no accident that the shrillness of political conversation has increased just as broadband and YouTube have become staples of American life. The internet brings us back to the linguistic culture our species arose in—all about speech: live, emotional, unreflective, and punchy. The slogan trumps the argument. Anger, often of hazy provenance but ever cathartic (“I want my country back”) takes fire. All of this is reinforced by the synergy of on line “communities” stoking up passions on a scale that snail mail never could.