Category Archives: Geocoding

Creating & Using Location Shapefiles

.. GIS tools and methods to develop and update location shapefiles .. location shapefiles are essential to most GIS applications. Location shapefiles, or point shapefiles, enable viewing/analyzing locations on a map and attributes of these locations such as store or customer ID, street address, city, date updated, value, ZIP code and wide-ranging other attributes about the location. This section reviews tools and methods to develop and use location shapefiles. See more detail about topics covered in this section in the related Web page.

Viewing/Analyzing Store Locations in the Dallas, TX Area
The following graphic illustrates how store locations can be shown in context of other geography and associated demographic-economic attributes. This view shows store locations (red markers) in context of Dallas city (blue cross-hatch pattern) and broader metro area. Markers shown in this view are based on a location shapefile created using steps described below. The identify tool is used to click on a location and show attributes in a mini-profile.

.. view developed with ProximityOne CV XE GIS and related GIS project.

View the locations contextually with thematic patterns by tract or other geography. Combine views of store, customer, agent, competitor and other location shapefiles.
The following view shows patterns of median household income by census tract.

.. view developed with ProximityOne CV XE GIS and related GIS project.

Development of location shapefiles often starts with a list of addresses. Locations are not always address-oriented; they might be geographically dispersed measurement or transaction locations with no address assigned. In applications reviewed here, locations are organized as rows in a CSV file. Each CSV file contains like-structured attributes for each location. The example in this section uses store locations in the Dallas, TX area.

There are two basic methods used to create location shapefiles: 1) geocoding address-data contained in the source data file or 2) using the latitude-longitude of the location included in the source data file record. The focus here is on option 2 — using the latitude-longitude of the location already present in the source data file.

Creating a Location Shapefile
The process of creating a location shapefile uses the CV XE GIS Manage Location Shapefile feature. With CV running, the process is started with File>Tools>ManageLocationShapefile. The following form appears.

.. ManageLocationShapefile feature/operation in ProximityOne CV XE GIS.

CV XE GIS provides other ways to create location shapefiles:
• Tools>AddShapes>Points — click points on the map window canvas.
• Tools>FindAddress — creates a single point shapefile based on specified address.
• Tools>FindAddress (Batch) — creates a point shapefile based on specified file of address records.
See details in User Guide.

Steps to Create a Location Shapefile
The process of creating the shapefile “C:\cvxe\1\locations1pts.shp” can be viewed by clicking the Run button on the form (with CV running). Two input CSV structured files are required:
• data definition file
• source data file

There are two sets of illustration location input files included with the CV installer:
• locations1_dd.csv and locations1.csv (7 locations in Johnson County, KS)
• locations2_dd.csv and locations2.csv (252 locations in Dallas and Houston)
These files are located in the \1 (typically c:\cvxe\1) folder. The marker/location shapefile used in the map shown above was created using the locations2 input files.

Data Definition File
The Data Definition (DD) file is an ASCII/text file structured as a CSV file. It may be created with any text editor. The DD file is specific to the source data file, but in the case of recurring source data files for different periods the same DD file might apply to many source data files. There are several rules and guidelines for development of the DD file:
• there is one line/record for each field in the source data file.
• each line/record must be structured in an exact form:
.. each line/record is comprised of exactly 4 elements separated by a comma:
.. 1 field name for subject matter item
– must consist of 1 to 10 characters and include no blanks or special characters
.. 2 field type: C for character, N for numeric
.. 3 field length: an integer specifying the maximum width of the field
.. 4 maximum number of decimals for field (value is 0 for character fields)
The DD File must include three final fields:
LATITUDE,n,12,6
LONGITUDE,n,12,6
GEOID,c,15,0
The structure of these three DD file records must be as shown above. The source data file, described below, must have the LATITUDE and LONGITUDE fields populated with accurate values. The GEOID field may be populated with either an accurate value or a placeholder value like 0.

Example. Data for each store for the default DD file name “C:\cvxe\1\locations1_dd.csv” include the following fields/attributes:
NAME,C,45,0
STORE,c,15,0
ADDRESS,c,60,0
CITY,c,40,0
LATITUDE,n,12,6
LONGITUDE,n,12,6
GEOID,c,15,0
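
The DD file rules listed above can be checked before running the tool. The following is a minimal sketch in Python, not part of CV XE GIS; the function name parse_dd is hypothetical, and the reading of "no blanks or special characters" as letters, digits and underscore is an assumption.

```python
import csv
import io
import re

# Assumption: "no blanks or special characters" is read here as letters,
# digits and underscore only; the actual tool may be stricter or looser.
NAME_RE = re.compile(r"[A-Za-z0-9_]{1,10}")
REQUIRED_TAIL = ["LATITUDE", "LONGITUDE", "GEOID"]


def parse_dd(text):
    """Parse DD file text into a list of field dicts; raise ValueError on a rule violation."""
    fields = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        row = [item.strip() for item in row]
        if not any(row):
            continue  # ignore blank lines
        if len(row) != 4:
            raise ValueError(f"line {line_no}: expected 4 elements, found {len(row)}")
        name, ftype, length, decimals = row
        ftype = ftype.upper()  # the examples use both "C" and "c"
        if not NAME_RE.fullmatch(name):
            raise ValueError(f"line {line_no}: invalid field name {name!r}")
        if ftype not in ("C", "N"):
            raise ValueError(f"line {line_no}: field type must be C or N")
        if not length.isdecimal() or int(length) < 1:
            raise ValueError(f"line {line_no}: field length must be a positive integer")
        if not decimals.isdecimal():
            raise ValueError(f"line {line_no}: decimals must be a non-negative integer")
        if ftype == "C" and int(decimals) != 0:
            raise ValueError(f"line {line_no}: decimals must be 0 for character fields")
        fields.append({"name": name.upper(), "type": ftype,
                       "length": int(length), "decimals": int(decimals)})
    if [f["name"] for f in fields[-3:]] != REQUIRED_TAIL:
        raise ValueError("the final three fields must be LATITUDE, LONGITUDE, GEOID")
    return fields


# The example DD records shown above.
EXAMPLE_DD = """NAME,C,45,0
STORE,c,15,0
ADDRESS,c,60,0
CITY,c,40,0
LATITUDE,n,12,6
LONGITUDE,n,12,6
GEOID,c,15,0
"""

fields = parse_dd(EXAMPLE_DD)
print([f["name"] for f in fields])
```

A check like this catches a missing element or a misplaced LATITUDE/LONGITUDE/GEOID record before the shapefile is built.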

Optionally create a DD File using the Create DD File button on the form. Clicking this button will create a DD File containing attributes of the dBase file specified in the associated edit box. The DD File name is created from the dBase file name. If the dBase file name is “c:\cvxe\1\locations1pts.dbf”, the DD File will be named “c:\cvxe\1\locations1pts_dd.csv”.

About the GEOID
The GEOID is a 15 character code which defines the Census 2010 census block containing each location. The GEOID is generally assigned by the ManageLocationShapefile operation and is one of the important and distinctive features of this tool. The GEOID is used to uniquely determine, with the GIS application, any of the following: state, county, census tract, block group, or census block.

The GEOID, as used in this section, is the 15 character Census 2010 geocode for the census block. The GEOID value 481130002011012 (see in location profile in map at top of section) is structured as:
state FIPS code: 48 (2 chars)
county FIPS code: 113 (3 chars)
census tract code 000201 (6 chars)
census block code: 1012 (4 chars) (block group code: 1 — first of 4 characters)
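
The structure above can be expressed as simple string slicing. This is a small Python sketch; the function name split_block_geoid is hypothetical. The GEOID should be handled as text, since a numeric type would drop the leading zero of state codes such as 01.

```python
def split_block_geoid(geoid):
    """Split a 15-character census block GEOID into its component codes."""
    if len(geoid) != 15 or not geoid.isdecimal():
        raise ValueError("expected a 15-digit census block GEOID")
    return {
        "state": geoid[0:2],       # 2 characters
        "county": geoid[2:5],      # 3 characters
        "tract": geoid[5:11],      # 6 characters
        "block": geoid[11:15],     # 4 characters
        "block_group": geoid[11],  # first character of the block code
    }


parts = split_block_geoid("481130002011012")
print(parts)
```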

About the Source Data File
The Source Data File is an ASCII/text file structured as a CSV file. It is typically developed by exporting/saving an Excel or dBase file in CSV structure. There are several rules and guidelines for development of the source data file:
• fields must be structured and arranged as defined in the DD File.
• character fields must not contain embedded commas.
• final items in record sequence must be:
.. LATITUDE – must have accurate decimal degree value; 6 digit precision suggested.
.. LONGITUDE – must have accurate decimal degree value; 6 digit precision suggested.
.. GEOID – this may be 0, not assigned or the accurately assigned GEOID value.
– optionally create/rewrite the GEOID used in the new shapefile.
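
The source data rules above lend themselves to a quick pre-check. The sketch below is an illustration in Python, not the tool's own logic. It assumes the source file has no header row, uses the field names from the example DD file, and the two sample records are made up.

```python
import csv
import io

# Field names in DD file order (from the example DD file).
FIELD_NAMES = ["NAME", "STORE", "ADDRESS", "CITY", "LATITUDE", "LONGITUDE", "GEOID"]


def read_source_rows(csv_text, field_names):
    """Read source records into dicts, checking field count and coordinate ranges."""
    names = [n.upper() for n in field_names]
    if names[-3:] != ["LATITUDE", "LONGITUDE", "GEOID"]:
        raise ValueError("final fields must be LATITUDE, LONGITUDE, GEOID")
    rows = []
    for line_no, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        if not row:
            continue  # ignore blank lines
        # An unquoted embedded comma in a character field shows up as an extra field.
        if len(row) != len(names):
            raise ValueError(f"line {line_no}: expected {len(names)} fields, found {len(row)}")
        record = dict(zip(names, (value.strip() for value in row)))
        lat = float(record["LATITUDE"])
        lon = float(record["LONGITUDE"])
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise ValueError(f"line {line_no}: coordinates out of range")
        record["LATITUDE"], record["LONGITUDE"] = lat, lon
        rows.append(record)
    return rows


# Made-up records for illustration only; GEOID is the placeholder value 0.
SAMPLE = (
    "Store A,S001,100 Example St,Dallas,32.776700,-96.797000,0\n"
    "Store B,S002,200 Sample Ave,Dallas,32.800000,-96.800000,0\n"
)

rows = read_source_rows(SAMPLE, FIELD_NAMES)
print(len(rows), rows[0]["LATITUDE"], rows[0]["LONGITUDE"])
```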

Updates; Combining Vintages of Location Attributes
Location-based data might update frequently, even daily. The recommended method to add, update and extend the scope of location-based data is to create new location shapefiles corresponding to the different vintages or dates covered. The structure of the files must be the same so that the files can be used together or separately. Suppose there is one set of data covering year to date and a second set of data covering the following month. The ManageLocationShapefile operation would be run once for each time period, creating two shapefiles. These shapefiles may be added to a GIS project and used separately or in combination to view/analyze patterns.
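
Outside the GIS, the same-structure requirement can be illustrated with plain records. The following Python sketch is a swapped-in technique, not the CV XE GIS workflow: it checks that two vintages share the same fields and stacks them into one list with a hypothetical VINTAGE tag. The records are made up.

```python
def combine_vintages(vintages):
    """Stack like-structured record lists, tagging each record with its vintage label."""
    reference = None
    combined = []
    for label, records in vintages.items():
        for record in records:
            keys = list(record)
            if reference is None:
                reference = keys  # first record defines the expected structure
            elif keys != reference:
                raise ValueError(f"vintage {label!r} has a different field structure")
            combined.append({**record, "VINTAGE": label})
    return combined


# Made-up records for illustration only.
year_to_date = [{"STORE": "S001", "LATITUDE": 32.7767, "LONGITUDE": -96.797, "GEOID": "0"}]
next_month = [{"STORE": "S002", "LATITUDE": 32.8, "LONGITUDE": -96.8, "GEOID": "0"}]

combined = combine_vintages({"ytd": year_to_date, "next_month": next_month})
print([(r["STORE"], r["VINTAGE"]) for r in combined])
```

Keeping the vintage as an attribute allows the combined records to be filtered back into separate periods later.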

Join me in a Data Analytics Lab session to discuss more details about accessing and using wide-ranging demographic-economic data and data analytics. Learn more about using these data for areas and applications of interest.

About the Author
— Warren Glimpse is former senior Census Bureau statistician responsible for innovative data access and use operations. He is also the former associate director of the U.S. Office of Federal Statistical Policy and Standards for data access and use. He has more than 20 years of experience in the private sector developing data resources and tools for integration and analysis of geographic, demographic, economic and business data. Contact Warren. Join Warren on LinkedIn.

115th Congressional Districts: Analysis and Insights

.. interpretative data analytics; tools, data & methods ..  this section is focused on 115th Congressional District geographic, demographic and economic patterns and characteristics. Use tools and data reviewed here to examine/analyze characteristics of one congressional district (CD) or a group of CDs based on state, party or other attribute. Use the GIS resources described here for general CD reference/pattern/analytical views, to examine current demographics and demographic change and for redistricting applications. See this related Web section for more details.

Examining the 115th Congressional Districts
• the 115th Congress runs from January 2017 through early January 2019.
• FL, MN, NC, VA have redistricted since the 114th CD vintage;
  .. some 115th CDs have new boundaries compared to the 114th CDs.
• view, rank, compare CDs using the interactive table.
  .. table uses ACS 2015 data for 115th CDs & includes incumbent attributes.
  .. examine districts by party affiliation.
• use these more detailed 114th CD interactive tables
  .. data based on 2015 American Community Survey – ACS 2015.
  .. corresponding data for the 115th CDs from ACS 2016 available Sept 2017.
• use the new GIS project including 114th & 115th CDs described below.
  .. create CD thematic and reference maps;
  .. examine CDs in context of other geography & subject matter.
• join us in the April 25 Data Analytics Lab session

Visual Analysis of Congressional Districts
The following views 1) provide insights into patterns among the 115th CDs and 2) illustrate how 114th to 115th geographic change can be examined. Use CV XE GIS software with the GIS project to create and examine alternative views.

Patterns of Household Income by 115th Congressional District
The following graphic shows the patterns of the median household income by 115th Congressional District based on the American Community Survey 2015 1-year estimates (ACS2015). The legend in the lower left shows data intervals and color/pattern assignment.

.. view developed with ProximityOne CV XE GIS and related GIS project.

Charlotte NC-SC Metro Area
  – with 114th/115th Congressional District 12

The following graphic shows North Carolina CD 12 with the 114th boundary (blue), the 115th boundary (pale yellow) and the Charlotte metro (bold brown boundary).

.. view developed using the CVGIS software.

• View zoom-in to Charlotte city & Mecklenburg County.

115th Congressional District Interactive Table
Use the interactive table to examine characteristics of one congressional district (CD) or a group of CDs. The following graphic illustrates use of the interactive table. First, the party type was selected, Democratic incumbents in this example. Next, the income and educational attainment columns were selected. Third, the set of districts was sorted on median household income. It is quick and easy to determine that CA18 has the highest median household income among the selected districts. Try using the table to examine districts of interest.
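
The same select, pick-columns and sort sequence can be sketched in Python for data exported from a table. This is only an illustration: the function name, the column names and every row below are placeholders, not actual district data.

```python
def filter_and_rank(rows, party, columns, sort_key):
    """Keep rows for one party, sort descending on sort_key, return only chosen columns."""
    selected = [row for row in rows if row["party"] == party]
    selected.sort(key=lambda row: row[sort_key], reverse=True)
    return [{column: row[column] for column in columns} for row in selected]


# Placeholder rows; district labels and values are invented for illustration.
districts = [
    {"cd": "XX01", "party": "D", "mhi": 50000, "pct_bachelors": 30.0},
    {"cd": "XX02", "party": "R", "mhi": 90000, "pct_bachelors": 40.0},
    {"cd": "XX03", "party": "D", "mhi": 70000, "pct_bachelors": 35.0},
]

ranked = filter_and_rank(districts, party="D",
                         columns=["cd", "mhi", "pct_bachelors"], sort_key="mhi")
print(ranked[0]["cd"])
```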


Metro Situation & Outlook Reports Updated

Regional Demographic-Economic Modeling System (RDEMS) county table links are now embedded in Metro Situation & Outlook (S&O) Reports. Easily access the RDEMS county demographic-economic tables for metros of interest.

Use this link to access the Metro S&O Reports:
http://proximityone.com/metro_reports.htm
… click link in the “Code” column to access a specific metro.

… selected metros …
Atlanta .. Boston .. Charlotte .. Chicago .. Dallas .. Denver .. Los Angeles .. Honolulu .. Houston .. Miami .. Minneapolis .. New York .. Philadelphia .. Phoenix .. San Diego .. San Francisco .. Seattle .. Washington

All metros are available.

Join in … join us in the Data Analytics Lab sessions to discuss more details about accessing and using wide-ranging demographic-economic data and data analytics. Learn more about using these data for areas and applications of interest.

State and Regional Decision-Making Information

Organized on a state-by-state basis, use tools and geographic, demographic and economic data resources in these sections to facilitate planning and analysis. Updated frequently, these sections provide a unique means to access multi-sourced data to develop insights into patterns, characteristics and trends on wide-ranging issues. Bookmark the related main Web page; keep up-to-date.

Using these Resources
Knowing “where we are” and “how things have changed” are key factors in knowing about the where, when and how of future change — and how that change might impact you. There are many sources of this knowledge. Often the required data do not knit together in an ideal manner. Key data are available for different types of geography, become available at different points in time and are often not the perfect subject matter. These sections provide access to relevant data and a means to consume the data more effectively than might otherwise be possible. Use these data, tools and resources in combination with other data to perform wide-ranging data analytics. See examples.

Select a State/Area

Alabama
Alaska
Arizona
Arkansas
California
Colorado
Connecticut
Delaware
D.C.
Florida
Georgia
Hawaii
Idaho
Illinois
Indiana
Iowa
Kansas
Kentucky
Louisiana
Maine
Maryland
Massachusetts
Michigan
Minnesota
Mississippi
Missouri
Montana
Nebraska
Nevada
New Hampshire
New Jersey
New Mexico
New York
North Carolina
North Dakota
Ohio
Oklahoma
Oregon
Pennsylvania
Rhode Island
South Carolina
South Dakota
Tennessee
Texas
Utah
Vermont
Virginia
Washington
West Virginia
Wisconsin
Wyoming

Topics for each State — with drill-down to census block
Visual pattern analysis tools … using GIS resources
Digital Map Database
Situation & Outlook
Metropolitan Areas
Congressional Districts
Counties
Cities/Places
Census Tracts
ZIP Code Areas
K-12 Education, Schools & School Districts
Block Groups
Census Blocks


Relating Addresses to Digital Road Segments

.. enter an address on a Google Maps page, or many other similar Web-based “find address” tools, and view that address on a map. View street detail easily. But a view of the digital road segment is not available. Answers to these questions are not available:
• what are the demographic-economic attributes for that location?
• what are the left and right side address ranges for that road segment?
• what are the geocodes on the left and right sides of that road segment?
• what are the end-point coordinates for the road segment?
• how does that road segment (the route) extend through the city or county?

Use methods described in this section to answer these types of questions and examine “Address to Digital Road Segment” relationships. An example is used that features an address shown as a red marker in the banner graphic at the top of this page (Kansas City, MO area). That address is also used in the interactive data access form below. Use these resources and methods for most addresses in the U.S. Tools reviewed here are available at no fee. Use related Web section for full functionality.

Viewing Address/Location in Context of Digital Roads
In the following graphic, a geocoded address (point shapefile) is shown as a red marker (see pointer) in context of roads (streets/lines shapefile). The Identify tool is used to show the profile of associated road segment (TLID=91447676) — an intersection to intersection segment of Oak St. Performing a query on the roads shapefile/layer to locate ID 91447676, the segment is displayed as yellow-highlighted. This application/view is reviewed in this section.

Finding Address Attributes
Use the form below to find attributes for an address. To get started, click the Find button with default settings. Results are returned/displayed on this same page. More generally, key in an address of interest, select the type of geography (more about this below) and click Find. Try your own addresses of interest; use this tool for recurring address lookups. Not clear on all the steps? Join us in a Data Analytics Lab session and get answers to questions.

Click following graphic for full functionality:

About the Data Content and Structure
When the Find button (above) is clicked, this page refreshes with returned data based on your query, that is, the values entered/selected in the section above the Find button. The first portion of the data displayed provides a structured display of selected subject matter items (ACS 5-year estimates) for the type of geography selected (e.g. tracts) for the area in which the address is located. The scope of those items could be substantially expanded.

Following that portion of the display are the geographic attributes displayed as JSON output resulting from the geocoder processing of the address. This is the display content below the text “Summary of address sent and matched results:”.

Road Segment Attributes
See road segment attributes for this address under “Summary of address sent and matched results:” and “Address: {“. See that the road segment ID is shown by “tigerLineId”: “91447676”. This ID uniquely identifies this road segment among all of the more than 45 million road segments in the U.S. (see more detail in related section).
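
When the JSON output is saved, the road segment ID can be pulled out programmatically. The Python sketch below searches for a key anywhere in the document, so it does not depend on the exact nesting. The sample JSON is a simplified, assumed layout built only from values quoted in this section; the real output contains many more items.

```python
import json


def find_key(node, key):
    """Return every value stored under `key` anywhere in a nested JSON structure."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            found.extend(find_key(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_key(item, key))
    return found


# Assumed, simplified layout for illustration; not the full geocoder output.
SAMPLE = """
{
  "Address": {"tigerLine": {"tigerLineId": "91447676"}},
  "2010 Census Blocks": [{"GEOID": "290950066002016"}]
}
"""

document = json.loads(SAMPLE)
print(find_key(document, "tigerLineId"))
print(find_key(document, "GEOID"))
```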

Viewing the Address with GIS Tools
1. Install CV XE GIS software (if already installed, skip step 1).
.. Install package — Windows 32/64
.. Start-up Readme
2. After installation, with CV XE GIS running, open a GIS project:
.. Use File>Open>Dialog and open the GIS project named c:\cvxe\1\cvxe_us2.gis. Perform these steps:
2.1. In the Legend Panel, uncheck layers $MHI x BG and Locations.
2.2. Use the Add Layer button to add the layer c:\cvxe\1\$$address1.shp.
.. address used in this application now appears as red marker
.. this single point shapefile was created using the CV XE Tools>FindAddress
2.3. In Legend Panel, check layer Jackson Cty MO roads.
.. now, all roads in the county can be viewed with marker.
.. optionally zoom-in to develop view like shown at top of this page.

.. view developed with ProximityOne CV XE GIS and related GIS project.

2.4. Use GIS Tools>FindShape tool to show all occurrences of “Oak St”
.. on the FindShape form the FULLNAME field is selected and set to “like” Oak St% .. all road segments having name like “Oak St%” are highlighted in yellow.

.. view developed with ProximityOne CV XE GIS and related GIS project.

2.5. Use GIS data table feature to show all occurrences of “Oak St” as table.
.. there are 155 road segments, shown in the next graphic with left and right-side address ranges .. optionally export for use in other applications.
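
The "like" Oak St% selection used in step 2.4 can be mimicked on exported road records. This Python sketch treats % as "any run of characters" and assumes a case-sensitive match, which may differ from the FindShape behavior. Apart from the segment ID and field names quoted in this section, the records are placeholders.

```python
import re


def like_to_regex(pattern):
    """Translate a pattern that uses % as a wildcard into a compiled regular expression."""
    pieces = [re.escape(piece) for piece in pattern.split("%")]
    return re.compile(".*".join(pieces))


# TLID 91447676 is the Oak St segment from this section; the rest are placeholders.
segments = [
    {"TLID": 91447676, "FULLNAME": "Oak St"},
    {"TLID": 1, "FULLNAME": "Oak St"},
    {"TLID": 2, "FULLNAME": "E Oak St"},     # does not begin with "Oak St"
    {"TLID": 3, "FULLNAME": "Oakland Ave"},
]

matcher = like_to_regex("Oak St%")
matches = [s for s in segments if matcher.fullmatch(s["FULLNAME"])]
print([s["TLID"] for s in matches])
```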


Census Block Code
See section starting with “2010 Census Blocks”. In the default address run, the census block shows as “GEOID”: “290950066002016”: … state: 29 … county: 095 … tract: 006600 … block: 2016. Other items, such as the TIGER/Line segment ID and segment side, can also be important for some applications.

Any given address or location is contained within several types of statistical areas (e.g. census tract or block group) and political areas (e.g. city or county). We may want to know the demographic-economic characteristics of a location for any one or several of these geographies. Use the interactive tool on this page to access those data. For example, access/view the median household income of the location/address block group or the median household income of the location/address city.

Join me in a Data Analytics Lab session to discuss more details about analyzing citizen voting age population and use of data analytics to develop further detail related to your interests.


Crime Data Analytics

.. examining crime incidence and socioeconomic patterns and analyzing small-area and location-based data.

.. what are the crime patterns in neighborhoods or areas of interest? It is challenging to get useful answers to this type of question. Crime incidence data by location/address are often difficult or not possible to obtain. Even where the location-based crime data are available, the data must be geocoded, e.g., a census block code must be assigned to each address. Separately, demographic-economic data must be organized to examine contextually with the crime data.

Integrating Crimes by Location & Patterns of Economic Prosperity
– View developed using CV XE GIS and related GIS project.

Crime Data Analytics. Use the Crime Incidence and Socioeconomic Patterns GIS project and associated datasets to explore relationships between crime and small area demographic-economic characteristics. Follow the steps described below to study patterns and relationships in Kansas City and/or use this framework to develop similar data analytics for other areas.

Framework for a case study. 409 of Missouri’s 4,506 block groups are within the jurisdiction of the Kansas City police department (KCPD) and had one or more crimes in 2014 (latest fully reported year). There were approximately 10,400 crimes recorded by the KCPD in 2014, in the city area spanning four counties. In this section tools and data are used to examine crime patterns in Kansas City, MO. Crime data are included as markers/locations in a GIS project. Crime data are also aggregated to the census block level and examined as summary data (aggregate crimes by census block). Crime data are related to American Community Survey (ACS) 2014 5-year demographic-economic data at the block group geographic level.

To perform these types of analyses, it is important to start with location-based crime data that have been attributed with type of offense (offense code). Ideally, each crime incidence data record includes minimally the offense code and address of the crime. Such location-based crime incidence data have been acquired from the KCPD. These data are used to develop a shapefile that can be included in a GIS project.
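
Once each crime record carries a census block code, aggregation is a counting step. The Python sketch below counts incidents by census block and by block group, using the first 12 characters of the 15-character block GEOID as the block group code. The incident records and field names are placeholders, not KCPD data.

```python
from collections import Counter

# Placeholder incident records; offense labels and most GEOIDs are invented.
crimes = [
    {"offense": "burglary", "block_geoid": "290950066002016"},
    {"offense": "burglary", "block_geoid": "290950066002017"},
    {"offense": "homicide", "block_geoid": "290950066001003"},
]

# Crimes per census block (15-character GEOID).
by_block = Counter(crime["block_geoid"] for crime in crimes)

# Crimes per block group: state (2) + county (3) + tract (6) + block group (1).
by_block_group = Counter(crime["block_geoid"][:12] for crime in crimes)

print(by_block_group)
```

The block group totals can then be matched to ACS block group records that use the same 12-character code.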

Patterns of Crime Incidence in Kansas City, MO
The following graphic shows patterns of crime incidence by census block for the “Plaza Area” within Kansas City. This view shows all types of crimes aggregated to the census block level. Crimes committed where a handgun was involved are shown as black/red circular markers. Click the graphic for a larger view that shows legend and more detail.
– View developed using CV XE GIS and related GIS project.

Related views (click link to view graphic in new window):
Use the GIS project to develop variations of these views. Optionally add your own data.
Lay of the land: Kansas City city (cross hatched) in context of metro
All crimes as markers in Kansas City in 2014

Patterns of Economic Prosperity & Crime Incidence
The following graphic shows patterns of economic prosperity (median household income $MHI) by block group for the same general area as above. This view illustrates how two types of crimes (burglary, shown as blue triangle markers, and homicide, shown as red/black square markers) can be examined in context. Click the graphic for a larger view that shows legend and more detail.

– View developed using CV XE GIS and related GIS project.

Related views (click link to view graphic in new window):
Use the GIS project to develop variations of these views.
View similar to above, without $MHI layer

Data used to analyze patterns of economic prosperity/$MHI are based on the American Community Survey (ACS) 2014 5-year estimates at the block group geographic level. The same scope of subject matter is available for higher level geography. The GIS project/datasets include many types of demographic-economic subject matter that can be used to display/analyze different socioeconomic patterns.

Using Block Group Geography/Data
Census Block Groups sit in a “mid-range” geography between census blocks and census tracts. All cover the U.S. wall-to-wall and nest together, census blocks being the lowest common denominator for each. Block Groups (BGs) are the smallest geographic area for which annually updated ACS 5-year estimates data are tabulated.

Advantages of using BG geodemographics include the maximum degree of geographic drill-down (using ACS data) … enabling the most micro-perspective of demographics for a neighborhood or part of a study area. A disadvantage of using BG estimates is that the smaller area estimates typically have a relatively higher error of estimate.
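
One common way to compare the relative error of estimates is the coefficient of variation (CV). ACS margins of error are published at the 90 percent confidence level, so the standard error is the margin of error divided by 1.645. The Python sketch below applies that arithmetic; the estimate and margin-of-error values are invented for illustration and are not ACS data.

```python
Z_90 = 1.645  # ACS margins of error are published at the 90 percent confidence level


def coefficient_of_variation(estimate, moe):
    """Return the CV (percent): standard error as a share of the estimate."""
    standard_error = moe / Z_90
    return 100.0 * standard_error / estimate


# Invented values: same estimate, larger margin of error for the smaller area.
cv_block_group = coefficient_of_variation(estimate=50000, moe=8225)
cv_tract = coefficient_of_variation(estimate=50000, moe=3290)

print(round(cv_block_group, 1), round(cv_tract, 1))
```

A higher CV means the estimate is less reliable relative to its size, which is the tradeoff to weigh when drilling down to block groups.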

Crime Incidence and Socioeconomic Patterns GIS Project/Datasets
1. Install the ProximityOne CV XE GIS
… omit this step if CV XE GIS software already installed.
… run the CV XE GIS installer
… take all defaults during installation
2. Download the CISP GIS Project fileset
… requires ProximityOne User Group ID (join now)
… unzip CISP GIS project files to local folder c:\crime
3. Open the kcmo_crimes_2014.gis project
… after completing the above steps, click File>Open>Dialog
… open the file named C:\crime\kcmo_crimes_2014.gis
4. Done .. the start-up view shows the crime patterns.

Weekly Data Analytics Lab Sessions
Join me in a Data Analytics Lab session to discuss more details about accessing location-based data and block group demographics and integrating those data into analytical applications.  Learn more about integrating these data with other geography, your data and use of data analytics that apply to your situation.


Metropolitan Areas & Fortune 1000 Companies

.. examining Fortune 1000 companies by metro .. the mix of small and large businesses can be important to the dynamics and opportunities of any area. Larger businesses often provide a hub for the development and expansion of small businesses – an important ingredient of employment growth and economic progress. Visit the related Web section for more details.

Not surprisingly, the New York metro leads the way as headquarters to 115 Fortune 1000 companies. Check out the full list of metros with links to the corresponding metro report.

The Metropolitan Area Situation & Outlook Reports provide insights into the integrated mix of geographic, demographic, economic and business activity and patterns for individual metropolitan areas. The reports provide a summary of Fortune 1000 companies, notable businesses, government operations and other entities impacting the metro.

This section provides a summary of the number of Fortune 1000 companies derived from individual metro reports. 152 metros have one or more Fortune 1000 companies with headquarters in the metro. View the list of metros showing the number of companies as described below.

Fortune 1000 Companies by Metro
The graphic below shows Fortune 1000 companies as red markers in context of metropolitan statistical areas. Company addresses were geocoded, determining census block code. From the census block code, the county and metro were assigned.
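
The block-to-county-to-metro assignment can be sketched in Python. The county code is the first 5 characters of the census block code, and a county-to-metro lookup supplies the metro (CBSA) code. The lookup below is a two-county excerpt for illustration, the function name is hypothetical, and the company records are placeholders that reuse block codes appearing elsewhere on this page.

```python
from collections import Counter

# Illustrative excerpt of a county-to-CBSA lookup; a full table covers all metro counties.
COUNTY_TO_CBSA = {
    "48113": "19100",  # Dallas County, TX -> Dallas-Fort Worth-Arlington
    "29095": "28140",  # Jackson County, MO -> Kansas City
}


def metro_for_block(block_geoid, county_to_cbsa=COUNTY_TO_CBSA):
    """Return the CBSA code for a census block GEOID, or None if the county is not listed."""
    county_fips = block_geoid[:5]  # state (2) + county (3)
    return county_to_cbsa.get(county_fips)


# Placeholder company records for illustration only.
companies = [
    {"name": "Company A", "block_geoid": "481130002011012"},
    {"name": "Company B", "block_geoid": "481130002011012"},
    {"name": "Company C", "block_geoid": "290950066002016"},
]

per_metro = Counter(metro_for_block(c["block_geoid"]) for c in companies)
print(per_metro)
```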

– View developed using CV XE GIS and related GIS project.
– Click graphic for larger view and bolder metro outlines.

Number of Fortune 1000 Companies by Metro
The graphic presented below shows the metros having the largest number of Fortune 1000 companies. See the full list in the related Web section. On that page, click a link to view Metro Report and list of Fortune 1000 companies located in the report (section 2). To look up metros of interest, use the all metros table.

See the related Web section for more details.
