Data Sources Used by Run on Sun
The data visualizations on the Run on Sun website draw on a wide variety of data sources. To help our visitors understand how these visualizations were created, this page links to and describes each of those sources:
- California Solar Initiative (CSI) Data
- National Oceanic & Atmospheric Administration (NOAA) Solar Formulas
- Southern California Edison Rate Filings (Tariffs)
- U.S. Census Bureau
California Solar Initiative (CSI) Data - Now California Distributed Generation Statistics
When the California Solar Initiative ended in 2014, the CSI data, along with subsequent data on distributed generation in California, moved to the California Distributed Generation Statistics site.
The CSI data set must be cleaned before it can be used for any analysis. In particular, the data set as presented on the website contains extraneous lines, and these must be removed before pivot tables or SQL queries will work properly.
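As an illustration of that cleanup step, here is a minimal Python sketch. It is not the procedure we actually used; it assumes that the first non-blank line is the header and that "extraneous" means blank lines, repeated header lines, or lines whose field count differs from the header. The function name clean_csv_rows and the column names in the sample are hypothetical.

```python
import csv
import io


def clean_csv_rows(text):
    """Return (header, rows), dropping lines that do not look like data records.

    Assumption: the first non-blank line is the header row.
    """
    reader = csv.reader(io.StringIO(text))
    header = None
    rows = []
    for record in reader:
        # Skip blank lines and lines that contain only empty fields.
        if not any(field.strip() for field in record):
            continue
        if header is None:
            header = record
            continue
        # Skip repeated header lines and lines with the wrong number of fields.
        if record == header or len(record) != len(header):
            continue
        rows.append(record)
    return header, rows


# Hypothetical sample: a blank line, a repeated header, and a stray note.
sample = (
    "id,city,kw\n"
    "\n"
    "A-1,Pasadena,4.2\n"
    "id,city,kw\n"
    "note: hypothetical stray line\n"
    "A-2,Altadena,5.0\n"
)
header, rows = clean_csv_rows(sample)
print(header)
print(rows)
```

The cleaned rows can then be written back out with csv.writer or loaded into a database. If the real file begins with a title line rather than the header, that line would need to be skipped first.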
From time to time we have uploaded versions of the CSI data set that we used for a particular analysis. For example, our August 2013 analysis of the State of SoCal Solar included a link to this zipped data set file (11 MB) that was the basis for that series of posts.
NOAA Solar Calculation Formulas
Calculating sunrise and sunset times requires a surprisingly complicated set of equations. Fortunately, the folks at the National Oceanic & Atmospheric Administration have published spreadsheets containing the necessary formulas.
From their website:
The calculations in the NOAA Sunrise/Sunset and Solar Position Calculators are based on equations from Astronomical Algorithms, by Jean Meeus. The sunrise and sunset results are theoretically accurate to within a minute for locations between +/- 72° latitude, and within 10 minutes outside of those latitudes. However, due to variations in atmospheric composition, temperature, pressure and conditions, observed values may vary from calculations.
The following spreadsheets can be used to calculate solar data for a day or a year at a specified site. They are available in Microsoft Excel and Open Office format. Please note that calculations in the spreadsheets are only valid for dates between 1901 and 2099, due to an approximation used in the Julian Day calculation. The web calculator does not use this approximation, and can report values between the years -2000 and +3000.
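To give a sense of what those equations involve, here is a rough Python sketch of a sunrise/sunset calculation in the style of the NOAA spreadsheet. The coefficients are transcribed from the Meeus-based formulas as we understand them and should be checked against the NOAA spreadsheet itself before being relied on. The function name sunrise_sunset, its parameters, and the example coordinates are our own choices, not NOAA's. The zenith angle of 90.833° is the customary allowance for atmospheric refraction and the radius of the solar disk.

```python
import math
from datetime import date


def sunrise_sunset(day, lat_deg, lon_deg, tz_hours):
    """Return (sunrise, solar_noon, sunset) as local clock hours.

    lon_deg is positive east of Greenwich; tz_hours is the offset from UTC.
    """
    # Julian Day at local noon (1721424.5 converts a Gregorian ordinal to JD).
    jd = day.toordinal() + 1721424.5 + 0.5 - tz_hours / 24.0
    t = (jd - 2451545.0) / 36525.0  # Julian centuries since J2000.0

    mean_long = (280.46646 + t * (36000.76983 + t * 0.0003032)) % 360.0
    mean_anom = 357.52911 + t * (35999.05029 - 0.0001537 * t)
    ecc = 0.016708634 - t * (0.000042037 + 0.0000001267 * t)
    m = math.radians(mean_anom)
    eq_ctr = (math.sin(m) * (1.914602 - t * (0.004817 + 0.000014 * t))
              + math.sin(2 * m) * (0.019993 - 0.000101 * t)
              + math.sin(3 * m) * 0.000289)
    omega = math.radians(125.04 - 1934.136 * t)
    app_long = math.radians(
        mean_long + eq_ctr - 0.00569 - 0.00478 * math.sin(omega))
    mean_obliq = 23 + (26 + (21.448 - t * (
        46.815 + t * (0.00059 - t * 0.001813))) / 60) / 60
    obliq = math.radians(mean_obliq + 0.00256 * math.cos(omega))
    decl = math.asin(math.sin(obliq) * math.sin(app_long))  # solar declination

    # Equation of time, in minutes.
    y = math.tan(obliq / 2) ** 2
    l0 = math.radians(mean_long)
    eot = 4 * math.degrees(
        y * math.sin(2 * l0) - 2 * ecc * math.sin(m)
        + 4 * ecc * y * math.sin(m) * math.cos(2 * l0)
        - 0.5 * y * y * math.sin(4 * l0)
        - 1.25 * ecc * ecc * math.sin(2 * m))

    # Hour angle of sunrise for a zenith of 90.833 degrees.
    lat = math.radians(lat_deg)
    cos_ha = (math.cos(math.radians(90.833)) / (math.cos(lat) * math.cos(decl))
              - math.tan(lat) * math.tan(decl))
    if not -1.0 <= cos_ha <= 1.0:
        raise ValueError("the sun does not rise or set on this date here")
    half_day_min = 4 * math.degrees(math.acos(cos_ha))

    noon_min = 720 - 4 * lon_deg - eot + tz_hours * 60
    return ((noon_min - half_day_min) / 60,
            noon_min / 60,
            (noon_min + half_day_min) / 60)


# Illustrative coordinates for a Southern California site, on daylight time.
rise, noon, sset = sunrise_sunset(date(2013, 6, 21), 34.15, -118.14, -7)
print(f"sunrise {rise:.2f} h, solar noon {noon:.2f} h, sunset {sset:.2f} h")
```

This sketch evaluates the sun's position once, at local noon, and derives the Julian Day from Python's calendar arithmetic, so it will not match the spreadsheet digit for digit. As the NOAA note above says, observed times can also differ from any calculation because of atmospheric conditions.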
Southern California Edison (SCE) Rate Filings (Tariffs)
SCE posts its rate tariff filings on its website, and you can find them on the Regulatory Information webpage.
Keep in mind that even though these rates are regulated by the California Public Utilities Commission (CPUC), they may change without notice.
United States Census Bureau
The U.S. Census Bureau is a treasure trove of useful data, all of which is available to the public.
We have used a variety of sources, some simply to answer a specific question, and others as the source of entire data sets.
State & County QuickFacts
The QuickFacts page answers a host of questions, such as how many households there are in the U.S.
According to the Census Bureau, as of 2011, there were 114,761,359 households in the United States. For our visualization on how fast new solar installations would have to occur in a year, we rounded that number up to an even 115 million.
Census Data Search Page
The Census Bureau provides a sophisticated search tool that lets users locate just the data they are seeking. Given that the Census website houses thousands of data tables, this is a very important resource!
For example, for our visualization regarding solar installations in middle class neighborhoods, we used a table from the Census website titled "Household Income in the Past 12 Months (in 2011 Inflation-Adjusted Dollars)" to calculate the median household income in California by zip code.
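If an income table reports the number of households in each income bracket rather than a ready-made median (an assumption here, since the table's layout is not described on this page), one common way to estimate the median is linear interpolation within the bracket that contains the middle household. The Python sketch below illustrates that general technique. It is not necessarily the method we used, and the function name grouped_median and the bracket counts are hypothetical.

```python
def grouped_median(brackets):
    """Estimate a median from (lower, upper, count) brackets sorted by income.

    upper may be None for an open-ended top bracket.
    """
    total = sum(count for _, _, count in brackets)
    if total == 0:
        return None
    half = total / 2.0
    cumulative = 0
    for lower, upper, count in brackets:
        if count and cumulative + count >= half:
            if upper is None:
                # No upper bound to interpolate toward; report the lower bound.
                return float(lower)
            # Assume households are spread evenly across the bracket.
            return lower + (half - cumulative) / count * (upper - lower)
        cumulative += count
    return None


# Hypothetical counts for a single zip code.
example = [(0, 10000, 10), (10000, 20000, 20), (20000, None, 10)]
print(grouped_median(example))  # prints 15000.0
```

Running this once per zip code yields one estimated median per zip code. The even-spread assumption is crude for wide brackets, so the results are approximate.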
Zip Code Statistics
The Census Bureau also maintains an enormous amount of data organized by zip code.