The Third Portland Data Visualization Group | Thurs, April 29, 2010: 6:30–9 PM at Webtrends

It’s time for another Portland Data Visualization Meetup. The last one occurred on November 3rd, 2009. We’ll have five main presentations and networking time. Webtrends will again graciously host us on their top floor. Unfortunately, there will not be beer. If you want to, you can bring your own. Or if you know a company that could bring some, let me know.

Schedule

1. StreamGraphs for Visualizing Twitter searches

Amber Case @caseorganic
Background of streamgraphs for visualizing Twitter searches

  • Use case
  • Limitations

Aaron Parecki @aaronpk

Data collection from Twitter

  • Twitter whitelisting
  • Storing in MySQL
  • Preparing data (removing words) for Python library

Nathan Bergey @natronics

  • Implementing the streamgraph algorithm in Python
  • Open Source, Python
  • How it works
  • SVG library

Conclusions, analysis and highlights
Amber, Nathan, Aaron

>>Break<<

2. Ben Stabler

  • Flex/Flash data visualization platform
  • R visualization

3. Nathan Bergey

  • Gource (video)
    • Program for visualizing commit history in a git-based code project.

4. Aaron Parecki

  • Subversion Commit Logs
    • Showing work habits through visualizing the two years of source code logs.
  • GPS Map of Portland
    • Showing where Aaron has been in Portland from October 2009-April 2010.

5. Data Viz of Cyborg Reconstruction Fund

Data Visualization for PayPal donations to @caseorganic’s ankle surgery fundraiser.

  • PayPal Data
  • MySQL table
  • Google Maps/Chart code
  • Final Images


Who Should Go?

The event is open to everyone interested in or working in the field of data visualization. This means designers, programmers, information architects, data miners, anthropologists, etc. We’re expecting a similar number of people to last time (probably around 20-30 people).

Location and Time

Webtrends

851 SW 6th Ave.
Portland OR 97204
(map)

RSVP on Upcoming or view the event on Calagator.

Note to newcomers: If you haven’t been to Webtrends before, you might have a difficult time gaining access to the building. Please E-mail me for detailed instructions on how to enter the building, and a phone number you can reach to gain access once inside.

Google Group:

Ed Borasky started a Google group called pdx-visualization. As the name implies, it is a group for Portland-area people interested in languages and techniques for visualization of data. http://groups.google.com/group/pdx-visualization.

Innovation in Data Visualization Group on Flickr:

I’ve been collecting interesting data viz photos for a while now and posting them to Flickr. They’re all accessible on my Flickr account in this set. Most pictures contain descriptions and links to the viz sources. If you have any Flickr photos of data viz work you’ve done, or work you find innovative, be sure to add them to the group!

Also check out Aaron Parecki’s GPS Logs and Data Visualizations on Flickr.

Hope to see you all there!

——

About

Amber Case, (@caseorganic) is a Cyborg Anthropologist studying the interaction between humans and computers and how our relationship with information is changing the way we think, act, and understand the world around us. She’s obsessed with compressing the space and time it takes to get data from one place to another, especially when the final destination is the mind.


April Fool’s Hits Hard, Thousands Perish

A.I. Has Taken Over My Life

CADIE: the world’s first “artificial intelligence” tasked-array system, and CADIE’s Gmail Autopilot System destroyed the future of multiple sectors of workers, including myself.

As an overworked and uber-busy Cyborg Anthropologist, I’m left to simply let Artificial Intelligence write my books, publish them, and even go on book tours in place of me. Is it a better world because of this? I think not – I’d rather not wake up every morning to see a new pile of work done in my name. But every time I try to stop the AI, it tells me that I ‘shouldn’t do that’, so I’m left to sit in my room, doing nothing, while the AI does everything for me.

It even thinks my thoughts for me now, and is even writing this sentence.

Future of Web Work

And what has this new AI done to the rest of the web workers? I guess they’re out of work too – but I don’t know, because ever since I set my Gmail to CADIE’s Gmail Autopilot System, I’ve been able to sit back and watch my Google calendar fill up with guest speeches and keynotes, and the AI is going in my place.

AI Learning

But how does this AI learn all of this? I think the answer might be that CADIE is reading all of our Slideshare presentations today. That feisty panda has to learn how to sound slightly human from something, right? Although I’m not sure if she can parse The Onion yet.


Data Flows and Crises in Online Reputation Economies

Prior to network culture, traditional news outlets were the first reliable source for news concerning major events. This was because traditional news media outlets have established reputations for providing a certain level of credibility and reliability.

In a global, ever-connected economy, it is finally possible to rely on citizen media outlets to receive news almost as soon as it happens. However, people often have a limited basis on which to determine validity. Online, time and space for information gathering are compressed, which means that time and space for decision making are reduced as well. This is why online social networks try to use online metrics to establish validity in as short a time as possible.

Take, for example, critical situations like wars, attacks, accidents or natural disasters:

• In emergency situations, traditional media sources are often too slow in providing clear, relevant information.

• In delicate political environments, standard news outlets are often blocked from transmitting relevant information.

These situations call for non-traditional data points. These data points exist in the form of social nodes in networks. The wired network of the online world allows anyone close to the news source to have the same power as those with bigger budgets, more political power, or better transmission equipment, such as a traditional news source. Reputation in critical moments like these (such as earthquake reporting, or terrorist attack information and safety instructions) must be negotiated almost instantaneously. Unlike traditional offline news identities, there are no presuppositions of identity.

In a space where news sources are both distributed (in both senses of the word: “distribution of power” and “fragmentation”) and largely anonymous, reputation becomes the sole metric for validity. This is the problem that this paper tries to address.

Reputation is extremely complex. There is no single way to define it:

• It can take the form of a hyperlink between two places, abilities, or powers. In other words, reputation is a way of describing the link between two entities.

• It can be transitory, especially online, where reputation serves as a social construction only as long as it’s needed, depending on data flows, proximity to events, or distance between individuals. In other words, reputation is a dynamic system of situated knowledge that sorts social interactions.

• It can be a handshake, in a sense that both parties must agree to open up and exchange something valuable for a trust-relationship to happen. In the business realm, for instance, this action has been formalized in the act of exchanging business cards.

• It can be measured or tracked as an overlay on a series of data points showing relations and trust.

• It can be measured or tracked as factors that individuals share in common. More shared things will lead to more shared beliefs, value systems and judgment, and generally could lead to a better reputation.

Measuring Reputation

A new metric is thus needed in order to quickly determine credibility and reputation in the event of a crisis. Note that this paper does not aim to search for and establish the most accurate metric, but rather, one that provides the user with an idea about the situation, then leaves the ultimate value judgment in her hands. In other words, to be achievable both economically and in a timely way, the metric has to have enough ‘fuzziness.’

“What you want is a durable perception of person”, says programmer Anselm Hook, “one that allows one to quickly understand whether a piece of information from a source is reputable or not in the fastest way possible”. One way is to wait for a backup vote. Robert’s Rules of Order say that a statement must be seconded before it can be voted on by many. But in some cases, waiting for a second is difficult, because there may be only one person next to a data source or event that is capable of reporting it.

In order to determine a valid metric, one must define a few key elements of the online experience:

Interest and Power

Power is created by interest. This is most easily observed in online environments, where the creation and exchange of value and interest is most fluid.

Interest Groups

One could call an interest group a demographic. Demographics are those with specific lifestyles that influence interest, and also support those who create products or services that fulfill these interests.

Crises and Social Networks

During a crisis, interest groups tend to converge upon a single topic or news source. The creation of validity in a news source in an online social network is often very fast, and that source is generally not a traditional one. Network users who were formerly low-level nodes can suddenly become major nodes of traffic if they begin to provide data that has proxemic, relational, or newsworthy value.

Those nodes that can provide the fastest information have tremendous power over those who have recently turned to follow them.

Point A marks the status of normal social network conditions and interest groups.

B marks the first appearance of crisis in the social network.

C signals the ramp-up of information awareness among social groups not in the social interest group of the initial reformers.

At point D, the crisis becomes a topic of collective interest. Networks of trust re-broadcast the news to un-informed groups until the network is saturated with information from all groups capable of absorbing the information.

At E, the discussion of the crisis decreases due to crisis resolution or exhaustion of the topic. The crisis falls out of common interest and formerly melded interest groups diverge once again.

F marks the final resolution or disappearance of the crisis. The crisis falls almost completely out of social network conversation.

One of the problems with social networks during crises is quickly giving the nodes with the most valuable information a voice in an efficient way, and promoting them to the top of a social network so that all who need that information can find it.

Micromeasurement

On May 11th, 2008, an earthquake that measured 7.8 on the Richter scale hit China. Several of those who experienced the earthquake, including Twitter user @dtan, reported it on Twitter. Tech reporter Robert Scoble was able to rebroadcast the message to (at the time) approximately 40,000 followers.

But how did Robert Scoble know that @dtan’s Tweets were valid?

Was it the architecture of Twitter? A trust economy, established by the rapid exchange of everyday data on Twitter, helped. But Scoble’s reputation process takes a while. He has to first follow @dtan and, through direct or indirect exchange, determine the user’s reputation to report on an emergency. Of course, later on, additional reports arrived from other people in China who also experienced the quake. But it took CNN hours to report on the event. This demonstrates the agility, relevancy and accuracy of non-traditional nodes as news sources.

As an aside, tools such as Google’s translation engine allowed @dtan’s Tweets, which were written almost entirely in Chinese, to be translated into English, and passed on to a more global audience.

Improving Data Flows in Crisis

All individuals have social bases. There is an increasing number of individuals who use social networks as social bases. However, these bases are not necessarily the same. Social networks record relationships in different ways. One who uses the photo-sharing website Flickr as a social base interacts with data differently than a Facebook or Twitter user. Robert Scoble was able to transfer authority and power to @dtan very quickly, but rapid, local news of the earthquake was constrained to Twitter.

There was no system that looked at Twitter as a database and pulled out information. Neither was there a system that added Twitter’s earthquake updates to other relevant information coming out of mainstream news sources and other social networks.

To improve data flows in crisis, there is a clear opportunity to transcend data silos and aggregate data streams into a more accessible, unified database, so that users of different social networks, or limited social networks, can quickly access relevant information.

This calls for either:

• The establishment of an open standard for disaster reporting across networks.

• The use and appropriation of existing open standards for reporting.

For instance: the DiSo project is an initiative to facilitate the creation of open, non-proprietary and interoperable building blocks for the decentralized social web.
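As a rough illustration of what aggregating across silos could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Report` shape, the adapter functions, and the field names (`user`, `status`, `owner`, `caption`, and so on) are invented and do not reflect any real Twitter, Flickr, or DiSo API. The idea is simply that each network gets a small adapter that maps its posts into one shared record, after which all streams can be filtered and sorted together.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Report:
    """One crisis report in a shared, network-neutral shape (hypothetical schema)."""
    network: str
    author: str
    text: str
    posted_at: datetime


def from_twitter(raw: dict) -> Report:
    # Field names are invented for illustration, not Twitter's real API.
    return Report("twitter", raw["user"], raw["status"],
                  datetime.fromtimestamp(raw["time"], tz=timezone.utc))


def from_flickr(raw: dict) -> Report:
    # Likewise hypothetical: a photo caption is treated as the report text.
    return Report("flickr", raw["owner"], raw["caption"],
                  datetime.fromtimestamp(raw["uploaded"], tz=timezone.utc))


def aggregate(streams: dict, keyword: str) -> list:
    """Normalize every stream, keep reports mentioning the keyword, newest first."""
    adapters = {"twitter": from_twitter, "flickr": from_flickr}
    reports = [adapters[name](item)
               for name, items in streams.items()
               for item in items]
    matching = [r for r in reports if keyword.lower() in r.text.lower()]
    return sorted(matching, key=lambda r: r.posted_at, reverse=True)


# Made-up sample data standing in for two separate networks.
sample = {
    "twitter": [
        {"user": "alice", "status": "Earthquake felt downtown", "time": 1000},
        {"user": "bob", "status": "Lunch was great", "time": 1500},
    ],
    "flickr": [
        {"owner": "carol", "caption": "Cracked road after the earthquake",
         "uploaded": 2000},
    ],
}
unified = aggregate(sample, "earthquake")
```

An open reporting standard would, in effect, replace the per-network adapters with one agreed format, so that nobody has to write the translation layer by hand.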

Another alternative (besides traditional media) is to rely on many ‘Scobles’ on each social network who talk to and inform each other on current happenings at all times. This is highly impractical and very costly.

——————————

Additional Sources:

For more information and a full analysis of the Twitter Earthquake reporting, please visit: http://onlinejournalismblog.com/2008/05/12/twitter-and-the-chinese-earthquake/

Search Engine Reputation Management

——

About

Amber Case is a Cyborg Anthropologist and Tech Consultant from Portland, Oregon. She studies the effects of technology on the ways in which communities are built both off and online. You can follow her on Twitter @caseorganic.

Wordtracker Experiment | Finding Out What People Search For


People are searching for things all the time on the web. If you’re a blogger looking to write good content, it is a good idea to get out there on the net to find what people are searching for. There are a few tools for doing this, but I wanted to isolate one of them and play with it for a minute.

@marknunney posted a link on Twitter about a “new tool from Wordtracker for content ideas”, so I clicked over to the site and read the following:

“People often type complete questions into search engines: if you find these questions and answer them, you could get some great search traffic”.

Below it was a box for entering in a word, so I tried a few words out. The results were amusing enough for me to want to share them. Further analysis follows.

Find Questions People are Asking

Results for Life

I found the results for the word ‘life’ to be what one might expect. Right now, people are wondering about life insurance. However, the ‘color of life in ancient egypt’ is something that is phrased in such a strange way that it could warrant further research, especially since it was looked up 156 times. At #6, ‘what is the meaning of life’ is asked. I guess life insurance and ‘how time of my life was chosen for american idol’ were more important.

Life Wordtracker Results

My personal favorite is #13 — ‘how to summon a real life dragon’. I bet if someone were to write a post on that, they’d get lots of hits. Maybe lots of Diggs too. I’m not sure how I’d go about researching that one. It’s probably better than trying to write a post on #8 — ‘how to ruin someone’s life’.

Results for E-mail

How does E-mail work? Apparently people are asking this question. But there is an important trend happening elsewhere in these question results. That would be the address of one (or rather two) ‘cole sprouse’. They happen to be identical twins, and are, according to the Cole Sprouse Wikipedia article, “known for their roles in the film Big Daddy…and for portraying the title characters on Disney Channel sitcoms”. Good luck finding their E-mail address, as well as the address of Prince Harry, Zac Efron and Jamie Spears.

E-mail Wordtracker Results

But you can write about how to E-mail pictures, or #9’s ‘who invented E-mail’. That one actually seems particularly interesting. The narrative histories of everyday things are always a joy to read about.

Results for Business

‘How to write a business plan?’ Can’t one just download a template from Microsoft Word or something? That question is really a broad one. It depends on what kind of business one wishes to start. #3’s ‘how to start a cell phone business’ is pretty good. #10’s ‘what is the best business opportunity’ is a really intense question that cannot totally be answered. #11’s ‘how to start a web design business’ is actually very answerable by a variety of sources such as Design Float and Smashing Magazine.

Business Wordtracker Results

#8’s ‘how to start a business with no money’ is interesting. I think it’s never been easier — and more difficult. It’s probably time that’s the big issue. Taking a lot of time really works. A cell phone business might manifest as an online reseller of cell phone accessories.

Results for Google

I was confused by these search results. I didn’t think they’d be this broad, or this ill-informed. Are Google founders Larry and Sergey that obscure? I wonder what sources Wordtracker is using for its search queries.

Google Wordtracker Results

It might also be interesting to create a post on the founders of Google just to see what happened to it. I’m sure Wikipedia and Google have this question answered already.

Results for Yahoo

I queried Yahoo! just to see what would happen. Very similar to Google’s results, except there was a question of what Yahoo stood for. I’m actually wondering that myself right now (goes off to find the answer).

Yahoo Wordtracker Results

The answer as to what Yahoo! stands for comes from About.com’s ‘Internet for Beginners’. The answer is that “Yahoo! (spelled with an exclamation mark) is short for ‘Yet Another Hierarchical Officious Oracle’”. Apparently, “The original name: ‘David’s and Jerry’s Guide to the World Wide Web’, was appropriate, but not exactly catchy”. You can read the rest of Paul Gil’s Yahoo! article for the whole story.

Results for Money

The search results for money really surprised me. I had no idea that so many people wanted to know about ‘what presidents are on money’. I wonder what demographic asks this question the most. Is it youth? Is it due to a bet? Is it a homework assignment? Perhaps it is to clarify the use of slang words.

#2 and #3’s ‘how much money does it cost to open a bar’, and ‘how can kids make money’ are interesting. I’m wondering if more kids than parents searched for that phrase and if there is a way to tell. As for #2, a lot of people seem to dream of owning and running their own bars.

Money Wordtracker Results

I thought that #7’s ‘how to make money’ would be higher up on the results than that, but apparently the presidents on money trumps that. It’s also a much easier niche to write for than the seedy ‘how to make money’ post. #8 and #9’s ‘millionaires who give money to help’, and ‘millionaires who give free money’ make a lot of sense. Those questions make me wonder how many millionaires out there actually give money out to strangers who ask for it over the Internet. Generally, processes and charities are involved. Darn! Perhaps Google or a blogger will write about another way?

Results for Puppy

Puppies. They’re somewhat irresistible. So irresistible that a lot of people question just how large they’re going to get, apparently. It would be useful to make a site that gave information on how large any breed of puppy was going to grow. It would be complete with a puppy weight calculator, to answer question #3 as well. One would simply have to enter in the breed and the age of the puppy and the Internet robots would do the rest. Hooray for calculators.

Puppy Wordtracker Results

I wonder how many new pet owners typed in #7 after watching their new puppy pee on their freshly installed carpet? Is there any way to tell? Perhaps they should’ve just stuck to drawing a puppy instead of owning one, like those who searched for #4’s ‘how to draw a puppy’.

Results for Read

This was wild. I did not expect to get results on horoscopes, tarot cards or reading palms. I’m not sure what I expected originally, but it wasn’t this. I thought people liked books more than daily horoscopes. My college experience has given me some explanation for these results. Everyone in my dorm was obsessed with reading their horoscopes to each other. Some of them even printed out astrological charts. I didn’t participate.

Read Wordtracker Results

But result #6 makes a lot of sense in this respect. Aside from horoscopes, ‘To Kill a Mockingbird’ is one of those books that’s assigned to the majority of school districts across the United States. I’m assuming a lot of kids didn’t want to purchase the book, lent it to someone else, forgot it at home or at school, or were looking for a quick way to read a chapter before an annoying quiz after lunch or homeroom period.

Results for RSS

What *is* RSS? Oh man. It is probably the greatest thing since the last iteration of really cool stuff that people enjoyed. It allows the quick and easy access of content without having to browse for it. I recommend watching RSS in Plain English instead of searching for RSS in Google. It’s an extremely short video by the Common Craft show. Totally sweetopian.

RSS Wordtracker Results

Hmm…#13’s ‘how do i find my twitter rss fed url’ is curious. Not only is fed spelled incorrectly, it is searched for 7 times. I’m sure there’s a great tutorial on this out there somewhere. …Or is there? I suppose that’s for random people to find out.

Results for SEO

These were not surprising. ‘What does seo stand for?’ Search engine optimization, of course. How does one ‘become a certified seo?’ Well gee whiz, that’s a hard one. Probably from showing it on your own site, and the sites of your clients. And by not selling links from bad sites. Also, by educating people thoroughly about your techniques.

SEO Wordtracker Results

‘How to set up seo?’ Go through a standard checklist on your website, checking for alt tags, title tags, a sitemap, etc. I like the free Website Grader from HubSpot for a really quick website check and grade.
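If you’d rather poke at the checklist yourself, a couple of those items are easy to check with a few lines of Python. This is only a sketch under my own assumptions: it uses the standard library’s `html.parser`, looks at just two things (whether the page has a non-empty title and how many images lack alt text), and skips the sitemap entirely. The class name, function name and sample page are all made up for the example, and it has nothing to do with how Website Grader works.

```python
from html.parser import HTMLParser


class BasicSeoCheck(HTMLParser):
    """Collects two on-page signals: the <title> text and images without alt text."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "img":
            # An absent, valueless, or blank alt attribute counts as missing.
            alt = dict(attrs).get("alt")
            if not alt or not alt.strip():
                self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def check_page(html: str) -> dict:
    """Run the two checks on an HTML string and return a small report."""
    parser = BasicSeoCheck()
    parser.feed(html)
    parser.close()
    return {
        "has_title": bool(parser.title.strip()),
        "images_missing_alt": parser.images_missing_alt,
    }


# A made-up page: one image has alt text, one does not.
page = """<html><head><title>Puppy Weight Calculator</title></head>
<body><img src="a.png" alt="A puppy"><img src="b.png"></body></html>"""
result = check_page(page)
```

For the sample page, `result` reports that a title is present and that one image is missing alt text.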

Results for Six

I wondered about numbers next, so I checked out the number six. Purely arbitrary (by arbichance? arbitration?). I was amused to find such a long phrase at the top of the question results. ‘The six basic fears and how to eliminate them’. That is totally a book by Sharry Harris. #2’s ‘what are the six terms of geography’ totally sounds like a query taken directly from a homework assignment.

Six Wordtracker Results

#3’s ‘what are the six parts in a business letter’ has reminded me to re-examine my business letters for the correct number of parts. Perhaps I can do that while developing a six pack while using Six Sigma techniques.

Results for Twitter

‘What is Twitter?’ Ahh…if only there were an easy way to explain that. ‘How to use twitter’ is even more complicated. See, it is the emptiness of a vessel that gives it use-value, and Twitter is an empty vessel. The question of ‘What are you doing’ is never fully answered. Thus, how to use Twitter is like telling someone how to use a vase. The emptiness gives it many uses, whereas a tutorial can only give a finite amount of use-cases.

Twitter Wordtracker Results

Finding the Twitter RSS feed URL is another matter. Simply scroll down to the bottom left corner of your Twitter page and click on RSS. Or you can right click to ‘copy the address’ to place it elsewhere with ease.

Results for Unicorn

So Unicorns are very important to the state of the world. They give us a fantastic antithesis with which to view things. I assumed that I would get different results because of this mindset, but I did not. The number one search for unicorn relates to finding free Unicorn pictures to color. That is a total let-down.

Unicorn Wordtracker Results

I guess people are interested in drawing Unicorns, though. Perhaps they’ll make some awesome viral Unicorn videos when they get older. Like Charlie the Unicorn.

Final Verdict

WordTracker’s new tool is pretty fun, but I’m not sure how terribly useful it really is. I think you’re the judge for that.

————-

Amber Case is a Cyborg Anthropologist and Internet Marketing Consultant from Portland, Oregon. You can follow her online @caseorganic.

Social Network Spaghetti | Portland Web Innovators at Vidoop

Tonight was an event associated with Portland Web Innovators called Social Network Spaghetti. It happened at 7pm at Vidoop in Downtown Chinatown.

Adam DuVander started off by explaining that Portland Web Innovators is around three years old now. That makes it one of the cornerstones of the Portland Tech scene.

Scott Kveton told us his ideas on the state of current social networks. His charisma and ability to explain and parse complex ideas, systems, and trends was interesting and enjoyable to watch. I estimate around 30+ people showed up, and many interesting questions were raised from the audience.

Oh yeah…and there was lots of Bacon.

In case you missed it, the entire event was archived. Yes — every moment of the presentation can be viewed, thanks to @brampitoyo and @maestrojed.


Scott Kveton is a digital identity promoter, open source contributor, and VP of Open Platforms for Vidoop.


Check out more Scott Kveton

Scott Kveton’s Blog
BaconGeek
Twitter Scott Kveton
Vidoop

—-

Amber Case is a Cyborg Anthropologist from Portland, Oregon. She enjoys tech events and the minds of people who attend them. You can follow her on Twitter @caseorganic.