This is what I have so far:

# loading libraries
import csv

import numpy as np

# print every row of the CSV file, fields joined by ", "
with open('energy.csv', newline='') as csvfile:
    lines = csv.reader(csvfile)
    for row in lines:
        print(', '.join(row))

# define column names
names = ['Countries', 'Years', 'Number of Countries', 'Mean', 'Small Production',
         'Average Production', 'Large Production']

# create design matrix X and target vector y
# (df is never defined above; it has to be a pandas DataFrame loaded from energy.csv first)
X = np.array(df.iloc[:, 0:4])  # end index is exclusive; .ix has been removed from pandas
y = np.array(df['Countries'])  # another way of indexing a pandas df

Assignment:

The file energy.csv contains the energy generated by each country, in terawatt hours, for different years (from OECD Factbook 2011: Economic, Environmental and Social Statistics).

The data contains missing values (denoted by "..") and also aggregate rows for the EU, OECD, and World.

Requirements

You are to create a program in Python that performs the following:

1. Loads the energy.csv file (assume it's in the current directory) and creates a DataFrame object from it.

2. Replaces the missing values with the mean energy production for that country.

3. Removes the data for the aggregate values (EU27, OECD, World).

4. Adds a new column called 'Continent' and fills it in with the continent that corresponds to each country.

5. Creates a DataFrame that contains the continent name as the index and the following columns:

   a. 'num_countries' = number of countries for this continent
   b. 'mean' = mean energy production of the countries in this continent
   c. 'small_production' = 1 if the continent mean is less than the mean production of all countries minus one standard deviation; 0 otherwise
   d. 'avg_production' = 1 if the continent mean is greater than the mean production of all countries minus one standard deviation, but less than the mean production of all countries plus one standard deviation; 0 otherwise
   e. 'large_production' = 1 if the continent mean is greater than the mean production of all countries plus one standard deviation; 0 otherwise

6. Displays the new DataFrame on the screen.
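A minimal sketch of step 5 on a tiny made-up table, in case it helps to see the thresholds in code. It rests on one reading of the requirement: "mean production of all countries" is taken as the mean of the per-country means, and the standard deviation is the sample standard deviation of those per-country means. The countries A to E, their values, and the names `country_means`, `low`, `high`, and `summary` are illustrative only and do not come from the assignment.

```python
import pandas as pd

# Hypothetical per-country mean production (TWh) with continent labels
country_means = pd.DataFrame(
    {
        "Continent": ["Europe", "Europe", "Asia", "Asia", "North America"],
        "mean_production": [10.0, 20.0, 30.0, 40.0, 400.0],
    },
    index=["A", "B", "C", "D", "E"],
)

# Overall mean and standard deviation across all countries (assumed reading)
overall_mean = country_means["mean_production"].mean()
overall_std = country_means["mean_production"].std()
low = overall_mean - overall_std
high = overall_mean + overall_std

# One row per continent: number of countries and continent mean
grouped = country_means.groupby("Continent")["mean_production"]
summary = pd.DataFrame({"num_countries": grouped.count(), "mean": grouped.mean()})

# 0/1 flags; boolean comparisons are converted to integers
summary["small_production"] = (summary["mean"] < low).astype(int)
summary["avg_production"] = (
    (summary["mean"] > low) & (summary["mean"] < high)
).astype(int)
summary["large_production"] = (summary["mean"] > high).astype(int)

print(summary)
```

Because the three conditions use strict inequalities, a continent whose mean lands exactly on a threshold would get 0 in all three columns; whether that is intended is something to check with the instructor.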

1. The name of your source code file should be DataPrep.py. All your code should be within a single file.

2. You need to use the pandas DataFrame object for storing data.

3. Your code should follow good coding practices, including good use of whitespace and use of both inline and block comments.

4. You need to use meaningful identifier names that conform to standard naming conventions.

5. At the top of each file, you need to put in a block comment with the following information: your name, date, course name, semester, and assignment name.

Data File: energy.csv

Continent Map?

u'Australia':'Australia',
u'Austria':'Europe',
u'Belgium':'Europe',
u'Canada':'North America',
u'Chile':'South America',
u'CzechRepublic':'Europe',
u'Denmark':'Europe',
u'Estonia':'Europe',
u'Finland':'Europe',
u'France':'Europe',
u'Germany':'Europe',
u'Greece':'Europe',
u'Hungary':'Europe',
u'Iceland':'Europe',
u'Ireland':'Europe',
u'Israel':'Asia',
u'Italy':'Europe',
u'Japan':'Asia',
u'Korea':'Asia',
u'Luxembourg':'Europe',
u'Mexico':'North America',
u'Netherlands':'Europe',
u'NewZealand':'Oceania',
u'Norway':'Europe',
u'Poland':'Europe',
u'Portugal':'Europe',
u'SlovakRepublic':'Europe',
u'Slovenia':'Europe',
u'Spain':'Europe',
u'Sweden':'Europe',
u'Switzerland':'Europe',
u'Turkey':'Asia',
u'UnitedKingdom':'Europe',
u'UnitedStates':'North America',
u'Brazil':'South America',
u'China':'Asia',
u'India':'Asia',
u'Indonesia':'Asia',
u'RussianFederation':'Europe',
u'SouthAfrica':'Africa'
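One way the map above might be used for step 4 is sketched here. It assumes the country names end up as the DataFrame index, which the assignment does not state; `continent_map` and `energy` are illustrative names, the dict is only a three-entry excerpt of the full map, and the 2009 and 2010 values are copied from the data table.

```python
import pandas as pd

# Three entries taken from the continent map (the real dict has all countries)
continent_map = {
    "Austria": "Europe",
    "Canada": "North America",
    "Japan": "Asia",
}

# Small frame indexed by country, with two year columns from the data
energy = pd.DataFrame(
    {"2009": [65.6, 603.1, 1041.0], "2010": [67.0, 598.0, 1071.3]},
    index=["Austria", "Canada", "Japan"],
)

# Look up each country name (the index) in the dict to get its continent
energy["Continent"] = energy.index.map(continent_map)

print(energy)
```

A country missing from the dict would get NaN in the new column, so checking `energy["Continent"].isna().any()` after the mapping is a cheap way to catch spelling mismatches such as "New Zealand" versus "NewZealand".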

	1971	1990	1999	2000	2001	2002	2003	2004	2005	2006	2007	2008	2009	2010
Australia	53	154.3	203.6	209.9	224.3	227.4	226.3	236.3	245.2	247	250.8	257.1	260.9	256.2
Austria	28.2	49.3	59.7	59.9	60.9	60.4	57.7	61.5	63.6	61.7	62.2	64.1	65.6	67
Belgium	33.2	70.3	83.4	82.8	78.6	80.9	83.6	84.4	85.7	84.3	87.5	83.6	89.8	95.1
Canada	221.8	482	578.9	605.6	589.8	601.2	589.5	599.9	626	615.9	642	640.9	603.1	598
Chile	8.5	18.4	38.4	40.1	42.5	43.7	46.8	51.2	52.5	55.3	58.5	59.7	60.7	62.5
CzechRepublic	36.4	62.3	64.2	72.9	74.2	76	82.8	83.8	81.9	83.7	87.8	83.2	81.7	85.3
Denmark	18.6	26	38.9	36.1	37.7	39.3	46.2	40.4	36.2	45.6	39.3	36.6	36.4	38.6
Estonia	..	17.4	8.3	8.5	8.5	8.6	10.2	10.3	10.2	9.7	12.2	10.6	8.8	13
Finland	21.7	54.4	69.5	70	74.5	74.9	84.2	85.8	70.6	82.3	81.2	77.4	72.1	80.4
France	155.8	417.2	521.3	536.1	545.7	553.9	561.8	569.1	571.5	569.3	564.4	569.5	537.4	567.6
Germany	327.2	547.7	552.5	572.3	581.9	582	601.5	608.5	613.4	629.4	629.5	631.2	586.4	614.1
Greece	11.6	34.8	49.4	53.4	53.1	53.9	57.9	58.8	59.4	60.2	62.7	62.9	61.1	60.8
Hungary	15	28.4	37.8	35.2	36.4	36.2	34.1	33.7	35.8	35.9	40	40	35.9	37.4
Iceland	1.6	4.5	7.2	7.7	8	8.4	8.5	8.6	8.7	9.9	12	16.5	16.8	17.1
Ireland	6.3	14.2	21.8	23.7	24.6	24.8	24.9	25.2	25.6	27.1	27.9	29.9	27.9	28.3
Israel	7.6	20.9	39.2	42.7	44	45.5	47	47.3	48.6	50.6	53.8	57	55	57.2
Italy	123.9	213.1	259.3	269.9	271.9	277.5	286.3	295.8	296.8	307.7	308.2	313.5	288.3	295
Japan	382.9	835.5	1028.1	1049	1030.3	1049	1038.4	1068.3	1089.9	1094.8	1125.5	1075.5	1041	1071.3
Korea	10.5	105.4	235.6	288.5	309.1	329.8	343.2	366.6	387.9	402.3	425.9	443.9	451.7	478
Luxembourg	1.3	0.6	0.4	0.4	0.9	2.8	2.8	3.4	3.3	3.5	3.2	2.7	3.2	3.2
Mexico	31	115.8	190	204.2	211.9	215.9	213.7	232.6	243.8	249.5	257.2	261.9	261	268.4
Netherlands	44.9	71.9	86.7	89.6	93.7	95.9	96.8	102.4	100.2	98.4	105.2	107.6	113.5	114.7
NewZealand	15.5	32.3	37.8	39.2	39.9	40.7	40.8	42.5	43	43.6	43.8	43.9	43.5	44.8
Norway	63.5	121.6	122.3	139.6	119.2	130.3	106.8	110.2	137.2	121.2	136.1	141.2	132	124.1
Poland	69.5	134.4	140	143.2	143.7	142.5	150	152.6	155.4	160.8	158.8	154.7	151.1	157
Portugal	7.9	28.4	42.9	43.4	46.2	45.7	46.5	44.8	46.2	48.6	46.9	45.5	49.5	52.7
SlovakRepublic	10.9	25.5	28.1	30.8	31.9	32.2	31	30.5	31.4	31.3	27.9	28.8	25.9	27.3
Slovenia	..	12.4	13.3	13.6	14.5	14.6	13.8	15.3	15.1	15.1	15	16.4	16.4	16.2
Spain	61.6	151.2	205.9	222.2	233.2	241.6	257.9	277.2	288.9	295.5	301.8	311.1	291	295.3
Sweden	66.5	146	154.8	145.2	161.6	146.7	135.4	151.7	158.4	143.3	148.8	149.9	136.6	152.8
Switzerland	31.2	55	68.7	66.1	71.1	65.5	65.4	63.9	57.8	62.1	66.4	67	66.7	66.6
Turkey	9.8	57.5	116.4	124.9	122.7	129.4	140.6	150.7	162	176.3	191.6	198.4	194.8	211.2
UnitedKingdom	255.8	317.8	365.3	374.4	382.4	384.6	395.5	391.3	395.4	393.4	393	384.6	372	378.1
UnitedStates	1703.4	3202.8	3873.6	4025.9	3838.8	4026.4	4054.6	4148.1	4268.9	4275	4323.9	4343	4165.4	4337.1
EU27total	..	2567.8	2914.3	2996.7	3077.5	3099	3187.5	3254.2	3274.5	3318.9	3333.4	3339.4	3178.3	..
OECDtotal	3836.9	7629.3	9343.3	9726.9	9607.5	9888	9982.6	10252.7	10516.6	10590.3	10790.9	10809.8	10403.1	10772.2
Brazil	51.6	222.8	334.7	349.2	328.2	346	365.3	387.9	403.4	419.9	445.8	463.4	466.5	..
China	138.4	621.2	1239.8	1356.2	1472.4	1641.4	1908.5	2201	2499.7	2864.3	3276.3	3458.8	3695.9	..
India	66.4	289.4	536.6	561.2	579.9	597.3	634	666.6	698.2	753.2	813.9	843.3	899.4	..
Indonesia	1.8	32.7	85.8	93.4	101.4	108.3	114.1	121.3	127.8	132.7	140.9	148.4	155.5	..
RussianFederation	..	1082.2	845.3	876.5	889.3	889.3	914.3	929.9	951.2	993.9	1013.4	1038.4	990	..
SouthAfrica	54.6	165.4	200.4	207.8	208.2	215.7	231.2	240.9	242.1	250.9	260.5	255.5	246.8	..
World	5245	11819.1	14708.1	15403.4	15511.9	16114.5	16701.2	17490.9	18256.4	18960.6	19801.7	20164	20052.8	..
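The table above can be used to try out steps 1 to 3 without the file. The sketch below assumes the layout shown here (tab-separated, an empty first header cell, ".." for missing values); the real energy.csv may well use commas, in which case `sep` should be dropped. Only three rows and three year columns are copied from the table, and `sample` and `df` are illustrative names.

```python
import io

import pandas as pd

# Three rows and three year columns copied from the table above
sample = (
    "\t1971\t1990\t1999\n"
    "Estonia\t..\t17.4\t8.3\n"
    "Finland\t21.7\t54.4\t69.5\n"
    "World\t5245\t11819.1\t14708.1\n"
)

# ".." becomes NaN; the first column (country names) becomes the index
df = pd.read_csv(io.StringIO(sample), sep="\t", na_values="..", index_col=0)

# Fill each missing value with the mean of that country's own row
df = df.apply(lambda row: row.fillna(row.mean()), axis=1)

# Remove the aggregate row by label
df = df.drop(index=["World"])

print(df)
```

Filling row by row matters here: a plain `df.fillna(df.mean())` would use the mean of each year column, which mixes all countries (and the aggregate rows) into the replacement value.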

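#loading libraries
import pandas as pd

# treat ".." as missing; the first column (country names) becomes the index
df = pd.read_csv("energy.csv", na_values="..", index_col=0)
# fill each gap with that country's own (row) mean, not the mean of the year column
df = df.apply(lambda row: row.fillna(row.mean()), axis=1)
# drop the aggregate rows by label instead of by position
df = df.drop(index=["EU27total", "OECDtotal", "World"])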
