Saturday, February 25, 2012

Well, it was a good exercise ;-)

SO, in my Feb 18 post, I commented:
Now the question is how to convert longitude and latitude to positions on the image - not a DIFFICULT problem, but one that took some thought. I think I worked it out during a long drive this weekend. Will try to implement tomorrow. 
I spent a good hour or two earlier this AM getting it implemented. Then I moved forward in my book and about 6 pages later came upon the introduction of the map() function, which "converts numbers from one range to another," exactly my task. The author goes on to say "A lot of visualization problems revolve around mapping data from one range to another, so the map() method is used frequently."

Oh well. As I said - it was a good exercise. I'll go back and modify my code to take advantage of the map() function and make sure that I can make it work.

Note - and indeed, it did work. In the process I discovered that map() can also reverse an axis while it scales - mapping the range (36 - 25) onto (0 - 545).
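Since Processing's map() is just linear interpolation, here's a minimal standalone Java sketch of the same arithmetic (my own re-implementation for illustration, not Processing's source), including the reversed range from the note above - presumably latitude (36 down to 25) onto pixel y (0 to 545):

```java
// Re-implementation of the linear scaling that Processing's map() performs.
// Handles reversed output ranges too, e.g. latitude 36..25 -> pixel y 0..545.
public class MapDemo {
    // Equivalent to Processing's map(value, start1, stop1, start2, stop2).
    static float map(float value, float start1, float stop1,
                     float start2, float stop2) {
        return start2 + (stop2 - start2) * (value - start1) / (stop1 - start1);
    }

    public static void main(String[] args) {
        // Latitude 36 (top of the range) lands at y = 0 ...
        System.out.println(map(36f, 36f, 25f, 0f, 545f)); // 0.0
        // ... latitude 25 (bottom) lands at y = 545 ...
        System.out.println(map(25f, 36f, 25f, 0f, 545f)); // 545.0
        // ... and a value halfway through the input range lands halfway down.
        System.out.println(map(30.5f, 36f, 25f, 0f, 545f)); // 272.5
    }
}
```

The reversal falls out for free: when the input range runs high-to-low, the fraction (value - start1) / (stop1 - start1) still sweeps 0 to 1.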

Too long since I've done any coding :-)

So, back at the TX vis after an intense week on other things. Rather than start out right off including the code that converts lat and long positions into x and y on the image, I decided to do the conversion in Excel. Got that done quickly enough, then started trying to draw the dots onto the map. THIS went on for a couple of hours, with one error message after another - and all of them new to me. I moved data files back and forth from the example that I was copying. Moved map files back and forth. And FINALLY, as I worked through the two programs, side by side, line by line, I recalled a fundamental fact about computers and programming. You start counting from ZERO. So if you want to retrieve the elements in the second and third columns of a data set, you call them columns 1 and 2. Now the "out of bounds" error messages make more sense. I now have a TX map with about 350 dots drawn all over it. Need to get the lats/longs for the other 800 or so. 
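The zero-based lesson in miniature - the row below is hypothetical, but the indexing is exactly the trap described above:

```java
// Zero-based indexing: the SECOND and THIRD columns are indices 1 and 2.
public class ColumnDemo {
    public static void main(String[] args) {
        // A hypothetical tab-separated row: district name, longitude, latitude.
        String[] row = "Austin ISD\t-97.7425\t30.3269".split("\t");

        String lon = row[1]; // second column -> index 1
        String lat = row[2]; // third column  -> index 2
        System.out.println(lon + ", " + lat);

        // row[3] would throw ArrayIndexOutOfBoundsException -
        // the source of all those "out of bounds" messages.
    }
}
```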

Saturday, February 18, 2012

Positioning locations on the map

So the problems are falling one at a time on this - (1) have learned how to draw a map (or any image, for that matter) to the window, (2) have acquired a list of all of the districts in TX (all of the schools, for that matter, which I was able to parse down to a list of districts), (3) have figured out how to convert the addresses to longitude and latitude. Now the question is how to convert longitude and latitude to positions on the image - not a DIFFICULT problem, but one that took some thought. I think I worked it out during a long drive this weekend. Will try to implement tomorrow.

Thursday, February 16, 2012

Starting the Texas Ed Vis

So an early exercise in the Processing book is to display a map of the US, then draw a circle at the center of each state, reading the position data from a file. This aligns very much with my visualization project, which will involve drawing circles at the location of each TX school district.

Downloaded a TEA data file containing info on every school in the state. Extracted from it a list of every district in the state. Contained addresses (sometimes a PO box) for each district. Went looking for a way to convert addresses into longitude and latitude.

Found and tried a couple of things, but ultimately landed (surprise, surprise) at Google Code and the Google Maps API. One of the Web Services that they offer is a Geocoding API. Geocoding is the process of converting an address to latitude and longitude. Now I'm in WELL over my head, but I learn that you can send a request to the API as a plain HTTP GET, with the address in the query string. The request that I ultimately sent looked like this:

http://maps.googleapis.com/maps/api/geocode/json?address=2211+Lawnmont+Ave,+Austin,+TX&sensor=false

and it delivered a result that included something like this:

"formatted_address" : "2211 Lawnmont Ave, Austin, TX 78756, USA",
"geometry" : {
  "location" : {
    "lat" : 30.3268790,
    "lng" : -97.74253999999999
  }
}

So I'd managed to execute the task for one address. Now the question is how to do it for the 1250 addresses I have for Texas school districts. Time to consult w/ David Kim...
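A sketch of how the batch might be scripted in Java - the endpoint and sensor parameter are copied from the request above, but the helper name and the idea of looping over a district file are my own assumptions, and any real run against 1250 addresses would need throttling for Google's rate limits:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch: build the Geocoding API request URL for one district address.
// Fetching each URL, parsing the JSON for lat/lng, and pausing between
// requests (Google rate-limits unauthenticated use) is the remaining work.
public class GeocodeUrl {
    static String geocodeUrl(String address) {
        try {
            // URLEncoder turns spaces into '+' and commas into '%2C',
            // which the API accepts just as well as the hand-typed form.
            return "http://maps.googleapis.com/maps/api/geocode/json?address="
                    + URLEncoder.encode(address, "UTF-8") + "&sensor=false";
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(geocodeUrl("2211 Lawnmont Ave, Austin, TX"));
    }
}
```

Looping this over the district list is then just a file read plus one call per line.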

FYI - info on Google Geocoding API is here:

http://code.google.com/apis/maps/documentation/geocoding/

Wednesday, February 15, 2012

First Processing Code

Got through most of the Ch 01 coding examples this eve. Changed the NEAR burnt orange background that was used in the examples to a TRUE burnt orange - RGB = 199, 91, 18.

Tested each of the export functions - for web deployment or as a standalone application. Tested the standalone app for Mac (not Windows or Linux). Also tested the saveFrame() function and created more than 5000 identical TIFF files before I was able to stop the app. Also discovered that Processing is case sensitive!!!

I find it interesting that Processing can interact with a variety of renderers depending on need.

Monday, February 13, 2012

Visualizing Data - Ch 01

Read Chapter 1 of Visualizing Data, by Ben Fry.

"One of the most important skills in understanding data is asking good questions." p 4

Seven steps to answering a question with data:

Acquire - get the data
Parse - organize the data
Filter - remove data that is not of interest
Mine - use statistical or data mining techniques to discern patterns in the data
Represent - choose a visual model
Refine - improve upon the model to make it clearer and more engaging
Interact - provide methods to control which features are visible

Information visualization refers to the process of representing data that is primarily numeric or symbolic in nature.

Wednesday, February 8, 2012

Illuminating the Path (and Processing)

Downloaded and installed Processing (again, since my catastrophic HD crash of a few months ago).

Started to read Illuminating the Path: The Research and Development Agenda for Visual Analytics. The first thing that comes to my attention is that the goal of the volume is to develop an agenda that is SPECIFICALLY focused on visual analytics for the purpose of enhancing homeland security. I'm then forced to wonder how appropriate this particular agenda is for MY stated purpose - applying visual analytics to educational data. Whatever the case, Kelly Gaither's response, when I told her that I was considering this as one of the central texts for my study, was that this is ESSENTIAL reading. With that endorsement, I begin my review of this volume.

How about this statement of need:
"New methods are required that will allow the analyst to examine this massive, multi-dimensional, multi-source, time-varying information stream to make decisions in a time critical manner."


Here's a quote from page 5 that addresses my earlier stated concern:
"Although the agenda described herein is focused specifically on meeting homeland security challenges, the new capabilities created will have an impact on a wide variety of fields ranging from business to scientific research, in which understanding complex and dynamic information is important."

From page 9:
"interactions must be adaptable for use in platforms ranging from the large displays in emergency management control rooms to field-deployable handheld devices"

Tuesday, February 7, 2012

So far...

I've decided to track my activities for my independent study course for the spring of 2012 via this blog. I've been at it for a few weeks, so this will be a bit of a catch-up.

I'm going to focus on three related topics:

  • Educational data mining
  • Visual analytics
  • Information visualization
I've identified a text related to each and have begun reading the text on EDM. Early chapters survey techniques of (educational) data mining. It is worth noting that the second chapter of the book (chapter 1 was the Intro) was entitled Visualization in Educational Environments.

Chapter 3 is a useful review of the basics of statistical analysis. 
Chapter 4 describes the Pittsburgh Science of Learning Center (PSLC) DataShop. This resource provides (1) an environment for data storage and dissemination that aligns with the demands of the NSF data management plan requirement and (2) a ready repository of educational datasets. 
Chapter 5, which I'm currently reading, addresses classification as a data mining process. 

I've also started to develop a website which will serve as a repository for my learning, located here:


Falling asleep, so I'll publish this one and pick it up tomorrow.