Create a Python program that takes a badly formatted data se
Create a Python program that takes a badly formatted data set and converts it into CSV. Details:
The data has three fields which we will assume are always present: a number, a name (that contains no spaces), and a salary. The data may have multiple spaces between the fields and all fields for the same row may be on different lines. Data set is here:
Put a comment at the top of the Python file called lab5q2 with your name and student number. (0.5 mark)
Put the data above into the Python code file using a triple quoted string. (0.5 mark)
Use split to divide up the messy data where fields are separated by spaces. (0.5 mark)
Create individual lists for the ids, names, and salaries and print out each list. (1.5 marks)
Output the data in CSV form: id,name,salary. (1 mark)
Create a new list where each element is a list representing a row in the CSV file. Print this list. (1 mark)
Create and print a new list which includes only rows where the id >= 4 and id <= 8, or the person has a salary > 50000. (2 marks)
Output the original list in CSV form sorted by id ascending. (1 mark)
Create and print a new list which for each row increases the salary by 50% if the length of the person\'s name is 3 characters or less or starts with \'S\'. (2 marks)
Bonus: Output the original list in CSV form sorted by salary descending. (1 mark)
Bonus: Create a function to output in CSV form and use it multiple times rather than duplicating code. (1 mark)
Solution
#name:
#StudentNumber:
data = \"\"\"
2 Joe 95000 4 Steve
35000 1 Samantha 150000 10 Leah 99000
6 Riley 53215 7 Ashley 23424
15 Sheyanne 225000 9 Dave 35235
\"\"\"
l = data.split()
ids = []
names = []
salaries = []
rows = []
for i in range(0,len(l),3):
row = []
row.append(int(l[i]))
row.append(l[i+1])
row.append(int(l[i+2]))
rows.append(row)
ids.append(int(l[i]))
names.append(l[i+1])
salaries.append(int(l[i+2]))
print \"ID\'S\"
print ids
print \"Names:\"
print names
print \"Salaries\"
print salaries
print \"Rows\"
print rows
print \"CSV FORMAT\"
out = \"{0},{1},{2}\"
print \"Id,Names,Salaries\"
for i in rows:
print out.format(i[0],i[1],i[2])
query = []
for row in rows:
if(row[0] >= 4 and row[0]<=8 and row[2]>50000):
query.append(row)
print \"rows with id between 4 and 8 and salary greater than 50000\"
print query
for i in range(len(rows)):
for j in range(len(rows)):
if(rows[i][0]<rows[j][0]):
temp = rows[i]
rows[i] = rows[j]
rows[j] = temp
print \"Sorted Ascending order:\"
print \"Id,Names,Salaries\"
for i in rows:
print out.format(i[0],i[1],i[2])
print \"rows with name starting with S name lenght less than 3\"
query1 = []
for row in rows:
r = [row[0],row[1]]
if(len(row[1]) <= 3 or row[1][0] ==\'S\'):
r.append(row[2]+((row[2]*50)/100))
query1.append(r)
print query1
\"\"\"
sample output
ID\'S
[2, 4, 1, 10, 6, 7, 15, 9]
Names:
[\'Joe\', \'Steve\', \'Samantha\', \'Leah\', \'Riley\', \'Ashley\', \'Sheyanne\', \'Dave\']
Salaries
[95000, 35000, 150000, 99000, 53215, 23424, 225000, 35235]
Rows
[[2, \'Joe\', 95000], [4, \'Steve\', 35000], [1, \'Samantha\', 150000], [10, \'Leah\', 99000], [6, \'Riley\', 53215], [7, \'Ashley\', 23424], [15, \'Sheyanne\', 225000], [9, \'Dave\', 35235]]
CSV FORMAT
Id,Names,Salaries
2,Joe,95000
4,Steve,35000
1,Samantha,150000
10,Leah,99000
6,Riley,53215
7,Ashley,23424
15,Sheyanne,225000
9,Dave,35235
rows with id between 4 and 8 and salary greater than 50000
[[6, \'Riley\', 53215]]
Sorted Ascending order:
Id,Names,Salaries
1,Samantha,150000
2,Joe,95000
4,Steve,35000
6,Riley,53215
7,Ashley,23424
9,Dave,35235
10,Leah,99000
15,Sheyanne,225000
rows with name starting with S name lenght less than 3
[[1, \'Samantha\', 225000], [2, \'Joe\', 142500], [4, \'Steve\', 52500], [15, \'Sheyanne\', 337500]]
\"\"\"


