I have to answer the questions in the script below based on a CSV file similar to the picture below. I had to shorten the picture because there were over 1000 records, so you only see a few of them. I have completed the first 7 questions and need help with 8-11.
import json
import numpy as np
import pandas as pd

resdf = pd.read_csv('res.csv', encoding='utf8')
resdf.apply(lambda x: pd.api.types.infer_dtype(x.values))  # inferred type of each column
rawdata = resdf.values
rawdata
"""
question-1: Find how many different zipCode values are in the table
"""
zipvar = rawdata[:, 1]
zips = np.unique(zipvar)  # renamed so the built-in zip() is not shadowed
a = len(zips)             # number of distinct values
a
"""
question-2: Find how many different councilDistrict values are in the table
"""
couvar = rawdata[:, 3]
cou = np.unique(couvar)
b = len(cou)
b
"""
question-3: Find how many different zipCode values are in the table
"""
zipvar = rawdata[:, 1]
zips = np.unique(zipvar)
c = len(zips)
c
"""
question-4: Find how many different policeDistrict values are in the table
"""
polvar = rawdata[:, 4]
pol = np.unique(polvar)
d = len(pol)
d
"""
question-5: Find out which policeDistrict has the largest number of
restaurants. If you get more than one policeDistrict, put them in a list,
e.g. ['SOUTHERN','NORTHERN']
"""
z, indices = np.unique(rawdata[:, 4], return_inverse=True)
e = z[np.argmax(np.bincount(indices))]
e
"""
question-6: Find out which policeDistrict has the largest number of
restaurants. If you get more than one policeDistrict, put them in a list,
e.g. ['SOUTHERN','NORTHERN']
"""
z, indices = np.unique(rawdata[:, 4], return_inverse=True)
f = z[np.argmax(np.bincount(indices))]
f
"""
question-7: Find out which zipCode has the largest number of
restaurants. If you get more than one zipCode, put them in a list,
e.g. [21215,21217]
"""
z, indices = np.unique(rawdata[:, 1], return_inverse=True)
g = z[np.argmax(np.bincount(indices))]
g
"""
question-8: List all different neighborhoods in the SOUTHERN policeDistrict.
Put your answer in a list, e.g. ['Cherry Hill', 'Curtis Bay', 'Federal Hill']
"""
"""
question-9: After finding all BURGER KING restaurants, find out which
policeDistrict has more than one BURGER KING restaurant. If you get
more than one policeDistrict, put them in a list,
e.g. ['SOUTHERN','NORTHERN'].
NOTE: a name like BURGER KING # 10293 is also a BURGER KING restaurant.
"""
"""
question-10: Are there any replicated names in the location 1 column? If
the answer is yes, put them in a list,
e.g. ['Hopkins Pl Baltimore, MD','Hayward Ave Baltimore, MD']. If not, assign
an empty list to it.
"""
"""
question-11: How many different zipCodes are used in the CENTRAL
policeDistrict?
"""
"""
Fill in your answers in the following structure, e.g.
suppose a=[3 4 5] and b=39
answer={'question-1':a,'question-2':b}
"""
answer={'question-1':a ,'question-2':b ,
'question-3':c ,'question-4':d ,
'question-5':e ,'question-6':f ,
'question-7':g ,'question-8':h ,
'question-9':i ,'question-10':j ,
'question-11':k }
"""
DON'T CHANGE THE FOLLOWING CODE
"""
with open('myoutput.txt','w') as outfile:
    json.dump(answer,outfile)
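Because the answers are written out with json.dump, every value in the answer dictionary has to be a built-in Python type. Depending on how a value was computed, it may instead be a NumPy scalar or array, which the json module rejects. A minimal sketch of the problem and one way around it (the values below are made up for illustration and are not results from res.csv):

```python
import json
import numpy as np

count = np.int64(12)                            # hypothetical NumPy scalar result
districts = np.array(["SOUTHERN", "NORTHERN"])  # hypothetical NumPy array result

# json cannot serialize NumPy integer scalars directly
try:
    json.dumps({"question-1": count})
    raised = False
except TypeError:
    raised = True

# int() and .tolist() convert to plain Python objects first
safe = {"question-1": int(count), "question-5": districts.tolist()}
text = json.dumps(safe)
print(raised, text)
```

Converting each answer at the point where it is assigned keeps the final dump code unchanged.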
Solution
I have solved questions 8 to 11 using the DataFrame.
Feel free to ask if anything is unclear.
"""
question-8: List all different neighborhoods in the SOUTHERN policeDistrict.
Put your answer in a list, e.g. ['Cherry Hill', 'Curtis Bay', 'Federal Hill']
"""
# drop_duplicates() keeps each neighborhood once; the column name must match the CSV header exactly
h = resdf["neighbourhood"][resdf["policeDistrict"] == 'SOUTHERN'].drop_duplicates().tolist()
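The question asks for the different neighborhoods, and a plain boolean filter returns one entry per restaurant, so repeated names need to be dropped. A small sketch on made-up rows shows the difference (the sample frame and its values are hypothetical, and the column names follow the solution here rather than a confirmed CSV header):

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from res.csv
sample = pd.DataFrame({
    "neighbourhood": ["Federal Hill", "Cherry Hill", "Federal Hill", "Downtown"],
    "policeDistrict": ["SOUTHERN", "SOUTHERN", "SOUTHERN", "CENTRAL"],
})

southern = sample.loc[sample["policeDistrict"] == "SOUTHERN", "neighbourhood"]
with_repeats = southern.tolist()                   # one entry per restaurant
distinct = southern.drop_duplicates().tolist()     # each name once, in first-seen order
print(with_repeats)
print(distinct)
```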
"""
question-9: After finding all BURGER KING restaurants, find out which
policeDistrict has more than one BURGER KING restaurant. If you get
more than one policeDistrict, put them in a list,
e.g. ['SOUTHERN','NORTHERN'].
NOTE: a name like BURGER KING # 10293 is also a BURGER KING restaurant.
"""
# substring match so that names such as "BURGER KING # 10293" are included
bk = resdf[resdf["name"].str.contains("BURGER KING", na=False)]
counts = bk["policeDistrict"].value_counts()  # BURGER KING restaurants per district
i = counts[counts > 1].index.tolist()
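The note in the question matters here: an exact comparison with == only finds rows whose name is exactly BURGER KING, while a substring match also picks up numbered branches. A minimal sketch on made-up rows (the sample frame is hypothetical):

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from res.csv
sample = pd.DataFrame({
    "name": ["BURGER KING", "BURGER KING # 10293", "BURGER KING",
             "SUBWAY", "BURGER KING # 555"],
    "policeDistrict": ["SOUTHERN", "SOUTHERN", "NORTHERN", "SOUTHERN", "CENTRAL"],
})

exact = sample["name"] == "BURGER KING"   # misses the numbered branches
is_bk = sample["name"].str.contains("BURGER KING", regex=False, na=False)

counts = sample.loc[is_bk, "policeDistrict"].value_counts()
multi = counts[counts > 1].index.tolist()  # districts with more than one branch
print(int(exact.sum()), int(is_bk.sum()), multi)
```

With the exact comparison, each district in this sample would show at most one match and the result would be an empty list.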
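"""
question-10: Are there any replicated names in the location 1 column? If
the answer is yes, put them in a list,
e.g. ['Hopkins Pl Baltimore, MD','Hayward Ave Baltimore, MD']. If not, assign
an empty list to it.
"""
# duplicated() flags every repeat, so drop_duplicates() lists each replicated name once
loc = resdf["Location1"]  # the column name must match the CSV header exactly
j = loc[loc.duplicated()].drop_duplicates().tolist()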
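By default, duplicated() marks every occurrence after the first, so a location that appears three times is flagged twice. A short sketch on made-up values shows the effect and one way to list each replicated name once (the series below is hypothetical):

```python
import pandas as pd

# Hypothetical values; the real column comes from res.csv
loc = pd.Series([
    "Hopkins Pl Baltimore, MD",
    "Hayward Ave Baltimore, MD",
    "Hopkins Pl Baltimore, MD",
    "Hopkins Pl Baltimore, MD",
    "Light St Baltimore, MD",
    "Hayward Ave Baltimore, MD",
])

flagged = loc[loc.duplicated()].tolist()                       # repeats of repeats included
replicated = loc[loc.duplicated()].drop_duplicates().tolist()  # each replicated name once
print(flagged)
print(replicated)
```

If no value repeats, both lists come out empty, which matches the empty-list case the question asks for.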
"""
question-11: How many different zipCodes are used in the CENTRAL
policeDistrict?
"""
# int() keeps the value a plain Python integer for json.dump
k = int(resdf["zipCode"][resdf["policeDistrict"] == 'CENTRAL'].nunique())
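One more point on the earlier questions 5 to 7: they ask for a list when several values tie for the largest count, whereas np.argmax reports only the first maximum. A minimal sketch of a tie-aware variant using value_counts (the sample values are made up, and returning a single value when there is no tie is an assumption about the expected answer format):

```python
import pandas as pd

# Hypothetical values; the real column comes from res.csv
districts = pd.Series(["SOUTHERN", "NORTHERN", "SOUTHERN", "CENTRAL", "NORTHERN"])

counts = districts.value_counts()
# keep every label whose count equals the maximum, sorted for a stable order
top = sorted(counts[counts == counts.max()].index.tolist())
result = top if len(top) > 1 else top[0]
print(result)
```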


