Archive for technology

Social networking helping to preserve languages on brink of extinction

Source: justmeans.com

When people say that sites like Facebook, Twitter, and YouTube bring people closer together, it’s easy to think of it as a one-dimensional thing. However, these kinds of sites that help and encourage social networking have a few other major benefits that you might not consider at first.

For example, of the roughly 7,000 languages spoken around the world today, half are expected to be extinct by the year 2100. The cause often cited is globalization, along with the fact that a common language is the only real way for otherwise disparate cultures to come together.

However, sites like Facebook are helping speakers of minority (and thus endangered) languages find their voices. Professor K. David Harrison, an associate professor of linguistics at Swarthmore College and a National Geographic Fellow, explains the phenomenon:

“Small languages are using social media, YouTube, text messaging and various technologies to expand their voice and expand their presence. We hear a lot about how globalisation exerts negative pressures on small cultures to assimilate. But a positive effect of globalisation is that you can have a language that is spoken by only five or 50 people in one remote location, and now, through digital technology, that language can achieve a global voice and a global audience.”

Whilst plenty of minority languages will still die off in the years to come, it’s good to know that technology has given us all a way to connect with others, no matter what languages we speak.


Linguistic experts working on way to identify internet trolls

Source: bbc.co.uk/news

The original definition of an internet troll is somebody who purposefully posts something inflammatory or deliberately incorrect online in order to gain the attention and ire of fellow internet users. However, these days it is used in a general way to describe anybody who posts malicious or offensive content on the internet. Trolls tend to target places that are easy to sabotage or that have a large, otherwise sympathetic audience, like Wikipedia, or Facebook memorial and tribute pages for deceased people. Due to the anonymity of the internet, it is very difficult to locate and prosecute offenders.

This has prompted linguistic experts at the University of Central Lancashire in England to begin developing an automated system to track and identify certain word patterns and vocabulary often used by these malicious users. From the article:

Claire Hardaker, lecturer in linguistics and English language at UCLAN, said: “Everyone has a unique way of writing, of putting certain words together, which is subconscious.

“Many teenagers say they are able to identify who sent a text to them – just by the style of writing and word habits or the way the words are written.

“Someone might be pretending to be someone else, but by analysing the way they write online, we can determine a probable age, gender – even a probable region from where they come from.

“In its simplest form, people use different words for things – for example a bread roll. Some people would say a tea cake, some people would say a barm – it is these sort of elements that help to narrow down a search.”

It is proving hard for authorities to trace so-called trolls, and there have only been two people in England successfully prosecuted.

In related news, next Tuesday is Safer Internet Day, an annual event with the purpose of encouraging people to be safer online. Remember: don’t feed the trolls.


I can has academia? A thesis on “lolspeak”

Source: etd.lsu.edu

There are some who would think that your internet privileges should be revoked should you never have run across lolcats – photos of cats, often in humorous positions, captioned in what seems like a rudimentary form of English. One of the most famous lolcats, a rather overweight specimen of a cat, bears the caption “I can has cheezburger?”, which helped spawn not only a plethora of other lolcat images, but also the website icanhascheezburger.com (which has just celebrated its five-year anniversary), where people can submit their own.

lolcats are a prime example of what is known as a meme: an image or concept that spreads rapidly from user to user over the internet (captioned images like these are also known as “image macros”). There are many of these, and to the unfamiliar their names seem very odd indeed, for example: “socially awkward penguin”, “good guy Greg”, “nope! Chuck Testa”, and “internet husband”. Confused? Luckily there is a site that seeks to help out those dazed and confused by the huge variety of memes that have gained popularity or notoriety – knowyourmeme.com.

However, it seems that, five years after the meme first became popular online and spawned thousands of copycats (sorry), there is more to “lolspeak” than meets the eye. So much so, in fact, that Jordan Lefler, a student at Louisiana State University, has published a 129-page thesis entitled “I can has thesis?: A linguistic analysis of lolspeak”. In the thesis she explores “lolspeak” from its very roots as an image meme to its evolving linguistic properties. While it may seem to be a randomly warped version of English, it turns out that “lolspeak” – kind of like ebonics – is an artificial dialect in its own right, and actually has its own syntax, formulas and punctuation rules, the proof being that there is enough structure there for somebody to write an entire thesis about it.

You can read the entire thesis for free on the lsu.edu site.


Study shows better readers rely on a ‘visual dictionary’ to read quickly and accurately

Source: medicalxpress.com

I don’t usually refer to medical documents on this blog, but I thought this was a fascinating discovery from the neuroscientists at Georgetown University Medical Center (GUMC), and well worth linking to. Their studies showed that readers who are able to read especially quickly are relying on a ‘visual dictionary’ in their heads, which helps them immediately recognise common words. These findings are contrary to the long-held belief that our brains work on phonics, ‘sounding out’ words while reading in our heads.

How exactly did they discover this? Through a series of fMRI scans performed while test subjects were reading texts, and keeping track of which neurons were firing when each word was encountered.

Glezer and her co-authors tested word recognition in 12 volunteers using fMRI. They were able to see that words that are different, but sound the same, like “hare” and “hair”, activate different neurons, akin to accessing different entries in a dictionary’s catalogue. “If the sounds of the word had influence in this part of the brain we would expect to see that they activate the same or similar neurons, but this was not the case, ‘hair’ and ‘hare’ looked just as different as ‘hair’ and ‘soup’. This suggests that all we use is the visual information of a word and not the sounds.”

This reminds me somewhat of the widely circulated claim, attributed to researchers at Cambridge University, that so long as the first and last letters of each word stay in place, you can scramble the other letters in the words of a sentence and the brain can still comprehend the meaning. For example: “Aoccdrnig to rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae.”
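The scrambling trick in that example is easy enough to try for yourself. Here is a rough Python sketch of how one might do it. The function names `scramble_word` and `scramble_text` are my own invention, and it makes a simplifying assumption: a "word" is any run of unaccented letters, so a contraction like "doesn't" gets treated as two separate pieces.

```python
import random
import re


def scramble_word(word, rng):
    """Shuffle the interior letters of a word, keeping the first and last in place."""
    if len(word) <= 3:
        # Words this short have at most one interior letter, so nothing can move.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text, seed=None):
    """Scramble every run of letters in the text; punctuation and spaces are untouched."""
    rng = random.Random(seed)
    return re.sub(r"[A-Za-z]+", lambda m: scramble_word(m.group(0), rng), text)


print(scramble_text("According to research, the order of the letters does not matter."))
```

Passing a `seed` makes the result repeatable, which is handy if you want to show the same scrambled sentence to several people.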

“One camp of neuroscientists believes that we access both the phonology and the visual perception of a word as we read them and that the area or areas of the brain that do one, also do the other, but our study proves this isn’t the case,” says the study’s lead investigator, Laurie Glezer, Ph.D., a postdoctoral research fellow. She works in the Laboratory for Computational Cognitive Neuroscience at GUMC, led by Maximilian Riesenhuber, Ph.D., who is a co-author.

“What we found is that once we’ve learned a word, it is placed in a purely visual dictionary in the brain. Having a purely visual representation allows for the fast and efficient word recognition we see in skilled readers,” she says. “This study is the first demonstration of that concept.”

This study also gives a somewhat more elegant explanation for dyslexia – the brains of dyslexic people have a much smaller or less effective ‘visual dictionary’, and so they generally find reading a slow and laborious process – especially for words that they haven’t come across before. However, due to the findings of this study, it could be possible to help improve these skills at a younger age and thus offset the reading difficulties experienced by those with dyslexia.


How psychopaths speak

Source: dictionary.com

The word ‘psychopath’ is thrown around plenty on TV, but few may know the true definition of the mental disorder. Essentially, it’s an inability to empathize with others or establish any kind of meaningful relationship. However, this often means that a person exhibiting psychopathic behavior fits a certain pattern of other traits: extreme egocentricity, a failure to learn from experience, and a tendency to treat other people as a means to further their own ends, rather than individuals in themselves.

Here’s an interesting article on a study recently performed by Jeffrey Hancock, a professor of communications at Cornell University. By analyzing the way convicted murderers speak – the words they use, the most common patterns of speech, etc. – the researchers hope to build a model of ‘psychopathic’ language and apply it to the real world. From the article:

They [psychopaths] tend to see people as means to their own ends, rather than as individuals. These emotional abnormalities manifest in their speech patterns in a few interesting ways. The psychopaths who were interviewed tended to use a lot of causal phrases like “so” and “because.” The researchers interpreted this to mean that they were explaining their crimes away as a “logical outcome of a plan (something that ‘had’ to be done to achieve a goal).” In contrast, other convicted criminals who are not psychopaths tend to use more language around religion and their own guilt when describing their crime. The researchers observed other aberrations in psychopaths’ speech. Psychopaths in the study spoke of basic needs like food and money twice as much as the other subjects in the study, and they also use more disfluencies (phrases like “uh” or “umm”) to break up their speech.

One implication of this study is that police might be able to build a psychological profile of people from the language used in their Facebook statuses or Twitter updates, or any posts on public sites like Craigslist, forums, and the like.
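To get a feel for the kind of word counting the article describes, here is a toy Python sketch. To be clear, this is not the researchers' method: the two word lists are hypothetical, taken only from the handful of examples quoted above (“so”, “because”, “uh”, “umm”), and the function name `marker_rates` is made up for illustration. All it does is report how often each group of words turns up per 100 words of text.

```python
import re
from collections import Counter

# Hypothetical word lists, based only on the examples quoted in the article.
CAUSAL = {"so", "because"}
DISFLUENCIES = {"uh", "um", "umm"}


def marker_rates(text):
    """Return occurrences per 100 words for each group of marker words."""
    # Lowercase the text and pull out words, allowing one internal apostrophe.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not words:
        return {"causal": 0.0, "disfluency": 0.0}
    counts = Counter(words)
    causal = sum(counts[w] for w in CAUSAL)
    disfluency = sum(counts[w] for w in DISFLUENCIES)
    return {
        "causal": 100 * causal / len(words),
        "disfluency": 100 * disfluency / len(words),
    }


sample = "I did it because, uh, I needed the money, so, um, it had to be done."
print(marker_rates(sample))  # {'causal': 12.5, 'disfluency': 12.5}
```

A single sentence tells you nothing, of course. Rates like these would only start to mean something when compared across large amounts of text from many different speakers.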


A question of Yoda’s grammatical consistency, it is

Source: reddit.com/r/linguistics

I make no secret on this blog of my fondness for (perhaps bordering on obsession with) social bookmarking site Reddit. What sets Reddit apart from other similar sites is the quality of its community: unlike in other online communities such as YouTube, the ability to ‘upvote’ interesting, thought-provoking articles and comments means that the best stuff tends to float to the top. Also, given the sheer depth and breadth of Reddit’s userbase, any question you find yourself asking probably has at least a handful of people knowledgeable enough about the subject to help you.

Such was the case when I was recently browsing the linguistics section (or subreddit, to use a redditor’s parlance) of the site, and found that user Shakedown_1979 had raised a very interesting question about everybody’s favourite little green man, Yoda from the Star Wars franchise:

What is Yoda’s syntax in foreign dubs/subtitles in Star Wars?

What does Yoda’s syntax look like in non-English versions of Star Wars? For those who aren’t familiar with Star Wars (all two of you), Yoda is an alien who, when speaking English, uses what seems to be an OSV syntax instead of the traditional SVO syntax.

So how do foreign translations of the script handle this? I am particularly interested in what it looks like in non-SVO languages. Are there any translations where Yoda’s incorrect syntax is emulated by using an English-like syntax? Or are other languages’ syntax so free that mistakes in the use of case or verb conjugations must instead be used to emulate Yoda’s “alien” speech?

Put in simple terms, does Yoda muddle up his words in translated versions of Star Wars? This raises another query: since some languages are much more free with word order than the stricter subject-verb-object (SVO) syntax in English, would muddling up Yoda’s speech have the same effect?

Many comments followed from users all over the world, who shared their experiences of Yoda’s speech patterns from watching their country’s dub of Star Wars. The more detailed results follow (thank you again to Shakedown_1979, not only for asking the question but also for collating and listing the results so neatly!).

The overall answer is that thought has clearly gone into ‘translating’ Yoda for foreign audiences. While the word order may not always be the defining characteristic of his speech, in most foreign versions of all the Star Wars movies, he retains linguistic oddities that set him apart from everybody else.

(Note: S = subject, O = object, V = verb. English is an SVO language, in that we say “The cat (subject) chased (verb) the mouse (object)”. The list below denotes what the usual word order is in each language, if any, and whether Yoda follows the same template.)

  • Czech: A free word order language. Yoda speaks consistently in SOV. Interestingly enough, putting an object before a verb does sound unusual to most speakers of Czech.
  • Estonian: A free word order language. Yoda retains the English OSV order. This is grammatical in Estonian, but does make it seem as though Yoda is constantly stressing the object phrase as the main point of his statements. This gives his speech an unusual quality.
  • French: An SVO language. Yoda speaks in OSV.
  • German: An SVO or SOV language. Yoda brings the object to the front (OSV), like in English.
  • Hungarian: A free word order language. There is nothing unusual about Yoda’s speech.
  • Italian: An SVO language. Yoda speaks in OSV. Note: OSV is also the word order used by less-proficient speakers of Italian from the region of Sardinia.
  • Japanese: An SOV language. Yoda seems to use a more or less correct syntax, with a more archaic vocabulary.
  • Korean: An SOV language. Nothing is unusual about Yoda’s grammar.
  • Norwegian: An SVO language. Yoda speaks in OSV.
  • Romanian: An SVO language. Yoda speaks in OSV. He also places adjectives before the noun instead of after the noun, and uses an archaic form of the future tense.
  • Spanish: An SVO language. Yoda speaks in OSV.
  • Turkish: An SOV language. Yoda speaks in OSV. Note: This order is also used in classical Ottoman poetry, so the syntax may have been chosen in order to emphasize Yoda’s wisdom or age.
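
If the S, V and O shorthand still feels a bit abstract, here is a tiny Python sketch that shuffles the same three chunks of a clause into different orders. It is purely an illustration: the `reorder` function and the example clause are my own, and real translation obviously involves far more than moving three blocks around.

```python
def reorder(clause, order):
    """Arrange a clause's subject, verb and object in the given order.

    `clause` maps the letters "S", "V" and "O" to phrases;
    `order` is a string such as "SVO" or "OSV".
    """
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError(f"not a valid word order: {order!r}")
    return " ".join(clause[part] for part in order)


clause = {"S": "you", "V": "must learn", "O": "patience"}
for order in ("SVO", "OSV", "SOV"):
    print(order, "->", reorder(clause, order))
# SVO -> you must learn patience
# OSV -> patience you must learn
# SOV -> you patience must learn
```

The OSV line is the one that sounds like Yoda in English, which is exactly why translators into languages with a different default order have to find some other way to make him sound odd.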


Another TED talk: Deb Roy and recording his infant son’s every waking moment

Source: ted.com

When MIT cognitive scientist Deb Roy decided that he wanted to know how his infant son picked up language day-to-day as he developed, he went a little further than most. Rather than observe what he could, he decided that the best course of action would be to observe everything, and so he set up fish-eye cameras in every room of his house in order to document how his son dealt with and learned language.

For five years, starting from the very day the newborn baby was brought home from the hospital, the activity in each room was recorded and logged, and over 200 terabytes (200,000 gigabytes) of data subsequently parsed in order to understand how words developed from incoherent gagas and coos to concrete words like “water” and “ball”.

The talk, entitled “Birth of a Word”, is well worth watching for anybody interested in linguistic development. Some of the technology used is unbelievably impressive.


HarperCollins eBooks in libraries only good for 26 reads

Source: guardian.co.uk

In a move that simply reeks of maximizing profits in an increasingly digital age, publishers HarperCollins have decreed that their eBooks can only be borrowed 26 times before they have to be replaced. Their reason? Because apparently 26 is the magical number that represents the average number of loans before an actual book has to be replaced by the library.

Their official statement on the issue reads:

“HarperCollins is committed to the library channel. We believe this change balances the value libraries get from our titles with the need to protect our authors and ensure a presence in public libraries and the communities they serve for years to come.”

So, real reason? Money.

As a lover of eBooks (though I still believe nothing really comes close to the real thing), I see this move as pure greed on the publisher’s part. 26 also seems like a crazy number, and librarians across the US are in full agreement. In this video, two Oklahoma librarians put this to the test, to see exactly what toll 26 checkouts took on a book. Their findings are unsurprising.

It would be better for everyone if publishers would simply admit that they need to find a way to gain repeated revenue on a format that is essentially indestructible. As one of the comments on the YouTube video says, a tiered pricing system could work – through which the library pays a fee for every x times the book is borrowed, depending on its popularity.

If this kind of “rule of 26” was instituted on real books (i.e. libraries would have to restock the book every 26 times it was lent out), there would be an uproar – and libraries would fade away faster than they already do. Besides which, what would they do with all the ‘spoiled’ books? As a company that prides itself on its ‘innovation’, HarperCollins really seem to be giving the finger to libraries that cater for electronic book users.

Some people are choosing to boycott HarperCollins to demonstrate their disapproval of this decision. Mike Masnick of technology blog Techdirt has said of the decision: “Yes, seriously. They think they need to protect authors from libraries. That’s – to put it frankly – insane.”

I couldn’t agree more.


Patricia Kuhl: the linguistic genius of babies

Source: ted.com

Today I watched an absolutely mindblowing video on ted.com, the site of TED, a small non-profit group devoted to “ideas worth spreading”. There is a goldmine of interesting stuff in their archives, but as somebody with a keen interest in languages, I thought this video was truly worth sharing.

The talk is given by Patricia Kuhl, co-director of the Institute for Learning & Brain Sciences at the University of Washington and an expert on early language and brain development. We all know that babies and young children are better at picking up new languages than adults, but perhaps we forget just how much better. Her studies have shown that during early language learning, babies can do things adults simply can’t, and to surprising degrees.

You can view the video below, or go to the page on ted.com.


To doublespace or not to doublespace

You may have noticed that some people like to tap the space bar twice after finishing a sentence. Why do they do this?

The habit of double-spacing after a period comes from the days of typewriters. The problem with typewriters – besides error correcting, of course – was that due to technical limitations, the text was monospaced, also known as fixed-width or non-proportional. That is to say, unlike when you type on your computer, a narrow character like the letter l or i takes up just as much horizontal space as a wider letter like a D or H. This made things look a little messy sometimes, due to there being a lot of white space between certain letter combinations, and not so much between others. Therefore it became a habit for typewriter users to insert two spaces after a period, to make sentence breaks a little clearer.

With the exception of certain monospaced fonts like Courier, modern word processing has allowed for proportional fonts, which are easier to read and nicer looking. Since typewriters are rarely used these days, and since most people write with proportional fonts (programmers and web designers being the exception, since computer code can be easier to parse when everything lines up in rows and columns), the double space habit has largely been phased out. However, there are still people – usually older than 30 – who follow the old faithful and insist that periods should be followed by two spaces, not one.

Don’t listen to them: it’s an archaic habit. Most modern word processors will automatically adjust the size of a space following a period to be slightly larger than the space between words anyway – some will even go so far as to correct a double space to a single space.
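
If you have a document full of double spaces and want to tidy it up yourself, a one-line regular expression will do it. Here is a minimal Python sketch. The function name `single_space_sentences` is my own, and I'm not claiming this is how any particular word processor does it. It only touches runs of spaces that follow a period, exclamation mark or question mark, and leaves other spacing alone.

```python
import re


def single_space_sentences(text):
    """Collapse two or more spaces after ., ! or ? into a single space."""
    return re.sub(r"([.!?]) {2,}", r"\1 ", text)


print(single_space_sentences("First sentence.  Second sentence.   Third."))
# First sentence. Second sentence. Third.
```

Restricting the pattern to sentence-ending punctuation is a deliberate choice: it avoids flattening spaces that someone has used on purpose elsewhere, for example to line up columns in a monospaced font.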
