Using Census 2010 Summary File 1 with API Technology

The Census 2010 Summary File 1 (SF1) contains the most detailed Census 2010 summary statistics tabulated at the census block level. SF1 data are, and will continue to be, an important data resource throughout the 2010s and beyond. The subject matter includes cross-tabulations of age, sex, households, families, relationship to householder, housing units, detailed race and Hispanic or Latino origin groups, and group quarters.

Skip the downloads! Using pre-API methods, you would download the very large state-by-state zip files and proceed through a number of steps to use the data. An alternative would be the Census Bureau FactFinder, but that tool is infeasible for most types of dataset development operations.

As an example, the Texas Census 2010 SF1 (view these data here; not recommended) is contained in an 815 MB downloadable zip file. The zip file expands into a set of 48 files: a geographic segment and 47 comma-delimited (CSV) files. The expanded 48 files require 8.5 GB of space, still in CSV structure. Specialized software is then required to transform these data into usable structures. This is for reference and history; we can now skip that enormous and hard-to-use data.

Using API Technology
How things change! By using API-based applications, you can avoid downloading the very large SF1 files. This section reviews how you can use the APIGateway to develop a census block dataset. The tool supports many statistical programs and all popular geographies, but the focus here is on census block demographics from Census 2010 SF1.
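For readers curious what an API request for block-level SF1 data can look like, here is a minimal sketch that only builds a request URL and does not send it. The endpoint, the "for"/"in" geography syntax, and the variable names P003001 and P003002 are assumptions based on the Census Bureau's public data API, not details taken from the APIGateway; note that the API's variable names may differ slightly from the SF1 field names used later in this post. The function name build_block_query is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Census Bureau 2010 SF1 data API (not from this post).
BASE_URL = "https://api.census.gov/data/2010/dec/sf1"


def build_block_query(variables, state, county, tract, api_key=None):
    """Build a request URL for all census blocks in one tract (hypothetical helper)."""
    params = {
        "get": ",".join(variables),
        "for": "block:*",
        "in": f"state:{state} county:{county} tract:{tract}",
    }
    if api_key:
        # An API key may be required for heavier use; it is optional here.
        params["key"] = api_key
    return f"{BASE_URL}?{urlencode(params)}"


# Washington, DC (state 11, county 001), tract 004100.
url = build_block_query(["P003001", "P003002"], "11", "001", "004100")
print(url)
```

A tool like the APIGateway presumably issues requests of this general kind on your behalf and writes the results to a file, which is why no bulk download is needed.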

The next series of steps illustrates how to develop the sample dataset shown in the graphic below. The graphic shows a spreadsheet-oriented view in which rows correspond to census blocks and columns consist of a census block geoid (the geocode, a structured shorthand by which the geographic area is referenced) followed by fields of selected SF1 items.

Washington DC tract 004100 SF1 extract

Using the CV XE APIGateway with Census 2010 SF1
Follow these steps to run through the demonstration application. You can then use the APIGateway tool to process the geography and subject matter items of interest to you.

1. Install CV XE GIS Software
Use the CV XE GIS installer to install the software on your Windows computer.  Take all default settings.  More information about CV XE GIS.

2. Get Washington, DC GIS Project Files
Expand http://proximityone.com/dmd/2013_dc_dp.zip to folder c:\cvxe\1.

3. Start CV XE.  With CV XE running, start APIGateway
Use File>APIGateway from main menu bar.  The APIGateway form appears as shown below.

apigateway

The Batch Extraction operation involves two steps.
1 – Click main menu Settings>Batch Operation and specify settings.
2 – Close the Settings form. Click Tools>Batch Extraction to start processing.

The Batch Operations setting form is shown below.
APIGateway Batch Extract Settings

These settings will operate with no modification. To use them, close the form and click Tools>Batch Extraction to start processing. The output file c:\cvxe\1\blk_sf1_2010_p003p004.dbf is created (Output Dataset) and contains the fields shown in the Field Names edit box. The output file will contain ALL records for blocks in the Control File (c:\cvxe\1\tl_2013_11_tabblock_dp.dbf) that meet the criteria of the Control File Query (substr(geoid,1,11)='11001004100'). The graphic at the top of this section provides a partial view of the file created.
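To make the Control File Query concrete, the sketch below mimics its logic in Python: keep a block only when the first 11 characters of its geoid (state + county + tract) equal 11001004100. This is an illustration of the filter, not the APIGateway's own code; the function name matches_tract and the sample geoids other than 110010041001000 are hypothetical.

```python
def matches_tract(geoid, tract_prefix="11001004100"):
    """Mimic substr(geoid,1,11)='11001004100': compare the first 11 characters."""
    return geoid[:11] == tract_prefix


# Hypothetical block geoids standing in for Control File records.
control_geoids = [
    "110010041001000",  # tract 004100, block 1000
    "110010041001001",  # tract 004100, block 1001
    "110010042001000",  # tract 004200, so the query excludes it
]

selected = [g for g in control_geoids if matches_tract(g)]
print(selected)
```

Changing the prefix (and its length) is how the same idea would select a different tract, or a whole county with a 5-character prefix.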

You can use this version of the APIGateway to create your own datasets for any area of the U.S. The Batch Operation in this version operates only with Census 2010 SF1 block level data. The full version operates with many types of source data and supports a wide range of geographic levels, including ACS block group level data.

See the APIGateway Guide for more details regarding operations.

Using the Dataset Generated
There are at least two main ways the output dataset can be used.
1 – the block level data may be aggregated/analyzed in a tabular manner.
2 – the dBASE version of the output dataset is structured so that it can be immediately joined with a shapefile for mapping and geospatial applications.
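As an illustration of the first use, tabular aggregation, here is a minimal sketch that sums block rows up to block group and tract totals by truncating the geoid. The population values are made up for the example, and the function name aggregate_by_prefix is hypothetical; in practice the rows would be read from the output dbf file.

```python
from collections import defaultdict

# Hypothetical rows: (block geoid, P0030001 total population, P0030002 White alone).
rows = [
    ("110010041001000", 120, 80),
    ("110010041001001", 45, 30),
    ("110010041002000", 200, 150),
]


def aggregate_by_prefix(rows, prefix_len):
    """Sum both columns over all blocks sharing the first prefix_len geoid characters."""
    totals = defaultdict(lambda: [0, 0])
    for geoid, total_pop, white_alone in rows:
        key = geoid[:prefix_len]
        totals[key][0] += total_pop
        totals[key][1] += white_alone
    return {key: tuple(values) for key, values in totals.items()}


# 12 characters = state + county + tract + block group (first digit of the block code).
block_groups = aggregate_by_prefix(rows, 12)
# 11 characters = state + county + tract.
tract = aggregate_by_prefix(rows, 11)
print(block_groups)
print(tract)
```

The same geoid column is also the natural join key for the second use, matching each row to the block polygon with the same geoid in a shapefile.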

More about the Sample Dataset
The first row of the dataset shows selected data for census block 11-001-004100-1000. P0030001 is the field name/shorthand for Census 2010 total population. You can see the list describing these items using the table shells (xls); see table P3 in the xls file. Field P0030002 is the White alone population. Field name spelling is exacting; one character missed, incorrect, or out of place can cause an error. See sequential page 184 (numbered page 6-22 in the matrix section) in the SF1 technical documentation (pdf) to view the exact spelling of the field names and an alternative view of the table structure.
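The block code 11-001-004100-1000 follows the standard 15-character census block geoid layout: 2 digits for state, 3 for county, 6 for tract, and 4 for block. A minimal sketch of splitting a geoid into those parts follows; the names BlockGeoid and parse_block_geoid are hypothetical and chosen only for this example.

```python
from typing import NamedTuple


class BlockGeoid(NamedTuple):
    state: str
    county: str
    tract: str
    block: str


def parse_block_geoid(geoid):
    """Split a 15-character census block geoid into its component codes."""
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError(f"expected 15 digits, got {geoid!r}")
    return BlockGeoid(geoid[:2], geoid[2:5], geoid[5:11], geoid[11:])


parts = parse_block_geoid("110010041001000")
print("-".join(parts))  # prints 11-001-004100-1000
```

Keeping the codes as strings matters because leading zeros (as in county 001 and tract 004100) are part of the code.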


See related post regarding Mapping Census Blocks. The next post on using data generated by the APIGateway will be November 19, 2013.
