The answers to these questions are simple, so do not complicate your code and don’t try to do something clever. Similarly, in this problem set and in future problem sets, do not use special R packages unless asked.
Use the seq() command to generate a vector of numbers from -15 to 105, in steps of 5, and assign it to an object named rVector.
In Excel, create a column holding those same numbers and export it as a CSV file. The file name should be xxxxVector.csv, where xxxx is your last name, lowercase (e.g., mine would be hollandVector.csv). Import this file into R and assign it to an object named excelVector.
Subtract the first vector from the second, assigning the results to a third vector named vectorDifference. This should create a vector of the same length as the first two, with the first elements subtracted from one another, the second elements subtracted from one another, and so on.
Since the first two vectors should be identical, vectorDifference should be all zeroes. Use range() to determine whether this is true. If not all values are zero, fix your work. Always fix any incorrect work before you turn in your problem sets.
Delete all three vectors in one command, as you will not need them anymore. This is good practice, to clean the R working space of unneeded objects.
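As a rough illustration of the checks described in this part, the sketch below uses two made-up vectors (the names x, y, and d are hypothetical and are not the objects requested above). It shows element-by-element subtraction, a range() check, and removing several objects at once.

```r
# Hypothetical vectors, used only for illustration
x <- c(2, 4, 6)
y <- c(2, 4, 6)

# Subtraction works element by element on vectors of equal length
d <- y - x

# range() returns the minimum and maximum; both are zero only if every element is zero
dRange <- range(d)
dRange

# rm() accepts several object names in a single call
rm(x, y, d)
```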
Create a new vector that goes from -80 to 200 in steps of 2. Name that vector celsius.
Generate a vector named fahrenheit that shows the equivalent Fahrenheit temperature for each of the values in celsius. Calculate these values in a single line of code using arithmetic operators (+, -, *, /, ^) and use parentheses only where necessary. Do not use any built-in conversion tool you might find for R.
In one line of code, find the value from the fahrenheit vector that corresponds to a celsius value of 100 degrees. Hint: write a logical test that finds the element of celsius that is 100 degrees. Use that logical test to select the correct value of fahrenheit. Combine these into one simple line of code. This may take some experimentation, but you will want to ensure you know how to do this, as we will use this approach frequently.
In a one-sentence comment on the next line, state whether this value is the correct Fahrenheit temperature; be sure to state the two values in this sentence.
Do the same (code and comment), but for a celsius temperature of 0. Your code should be one line, as should your one-sentence comment.
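The idea of using a logical test on one vector to select from another can be sketched with unrelated data. The vectors height and weight below are hypothetical and assume the two vectors are paired element by element.

```r
# Hypothetical paired vectors, used only for illustration
height <- c(150, 160, 170, 180)
weight <- c(50, 58, 66, 75)

# The logical test is TRUE only where height equals 170
height == 170

# Placing that test inside the brackets selects the matching element of weight
weight[height == 170]
```

The test and the selection are shown separately here only to make each step visible; they combine naturally into the single bracketed expression on the last line.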
Enter the following matrix directly into R (not via Excel) and name it rMatrix. Remember to always follow any comma with one space, just as you do in your writing.
2  3   4  9.1
1  5   8  1.3
7  6  19  7.2
7  9   3  2.5
1  4   9  8.6
4  1   3  9.7
1  8   2  5.2
Verify that the values in rMatrix are correct. Note that although it is laborious to enter your data twice and compare the results (as you did in part 1), doing so will almost guarantee that you will find any errors in data entry because you are unlikely to make the same mistake twice. It is important to find data-entry errors early because they can cause you to waste many hours analyzing incorrect data. Do not include this double-entry when you turn in your work. When you write R code, you frequently will make these types of checks but you will discard them when you are sure that everything is working.
Use the colnames() function to assign column names to rMatrix, from left to right: snakes, spiders, birds, phosphorous. Pay attention to spelling and capitalization in these column names; use exactly what I specify. I demonstrated this in class, but use the help() function if you are uncertain how to use colnames(). The examples on help pages are often particularly instructive, and there is usually a button to click that will run the example code. Become familiar with this.
Likewise, use the rownames() function to consecutively label the rows of rMatrix from A to G.
Display rMatrix to verify that it looks correct. In every problem set, if I ask you to display an object, include that line of code in your list of commands. Otherwise, delete lines of code where you have opted to display an object to check on it.
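For reference, here is a minimal sketch of entering a small matrix row by row and labeling it. The object demoMatrix and its labels are hypothetical and much smaller than the matrix in this part; byrow = TRUE is one way to enter values in reading order and is an assumption about how you might choose to type them.

```r
# Hypothetical 2 x 3 matrix, entered row by row
demoMatrix <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)

# Column and row names are assigned as character vectors
colnames(demoMatrix) <- c("x", "y", "z")
rownames(demoMatrix) <- c("A", "B")

# Display the labeled matrix
demoMatrix
```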
Enter the same matrix into Excel, with the row and column labels as above. Save this file as a CSV file, with a name in the form of xxxxMatrix.csv, where xxxx is your last name, lowercase (e.g., mine would be hollandMatrix.csv).
Examine this file in a text editor and clean it if necessary. Excel sometimes adds extra rows, causing you to get an error about duplicate row names. If this happens, fix the CSV file in a text editor and import it again. Likewise, Excel sometimes enters additional columns, which will look like commas at the end of every line; delete those if this happens. While in the text editor, note the things you will need to import the file, such as the delimiter, whether there are column names, whether there are unique identifiers for each row, and whether any lines need to be skipped, such as explanatory comments.
Once you are sure the file format is correct in the text editor, import the .csv file into R, and name it excelDataFrame.
We often will want to check the class of an object, for example, whether it is a vector, matrix, data frame, list, or function. For example, matrices and data frames are often interchangeable, but some functions require one type. In two lines of code and using the class() command, verify that rMatrix is a matrix and that excelDataFrame is a data frame.
Likewise, we need a more robust way to evaluate whether the structure of the data is correct, particularly when importing data frames. In two lines of code, use the str() command to show the structure of rMatrix and excelDataFrame. Notice the different types of information supplied by class() vs. str(). Use str() to verify that the numbers of rows and columns are correct, and that each variable has the right type.
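The import-and-check sequence can be sketched on a tiny made-up file. Everything here is hypothetical: the file contents, the temporary path, and the object names. The header and row.names arguments reflect what you would have noted in the text editor for a file with column names in the first line and row identifiers in the first column.

```r
# Write a small hypothetical CSV to a temporary file, for illustration only
csvPath <- tempfile(fileext = ".csv")
writeLines(c("site,ants,pH", "X,3,6.5", "Y,8,7.1"), csvPath)

# Import: first line holds column names, first column holds row identifiers
demoFrame <- read.csv(csvPath, header = TRUE, row.names = 1)

# A matrix holding the same values, entered directly
demoMatrix <- matrix(c(3, 6.5, 8, 7.1), nrow = 2, byrow = TRUE)

# class() reports the kind of object
class(demoFrame)
class(demoMatrix)

# str() reports dimensions and the type of each variable
str(demoFrame)
str(demoMatrix)
```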
It is good practice to always follow these four steps:
In one line of code, subtract excelDataFrame from rMatrix and assign the result to an object called difference. Visually verify that all values are zero; this is a useful way of checking small matrices or vectors, but it is prone to errors when checking larger objects and should be avoided.
Use the appropriate function to determine whether difference is a matrix or data frame.
Use the range() function on difference to verify that all values equal zero.
In one line of code and using column-number notation, add the snakes and spiders columns of excelDataFrame, but do not assign the result to an object as we just want to see the results. Remember to include a space after every comma, just as you would in regular writing, as it makes code easier to read. Check the result to make sure it makes sense.
In one line of code, add the snakes and spiders columns using $ notation to access the columns of excelDataFrame. As before, just display the result and do not assign it to an object. The result should agree with the previous step.
Using the simplest built-in function in R, find the standard deviation of the phosphorous values in rMatrix. Do this in the simplest way that does not require phosphorous to be in column 4.
Do the same to find the mean phosphorous value in rMatrix.
In one line of code using row-column notation, and using row and column numbers, show the number of spiders in sample F of excelDataFrame.
Do the same, but use row and column names instead of row and column numbers. Remember that this approach requires that row and column names be supplied as strings, that is, in quotes.
Finally, in one line of code, show the number of spiders in sample F using $ notation to get the column, then bracket notation to get the row by number. Note that $ notation does not expect the column name to be a string.
Do those same three steps on rMatrix. You should find that the first two approaches work equally on matrices and data frames, but that the final approach (dollar-sign notation) does not work on matrices; it will produce an error (read it and remember it for when you encounter this later). Normally, you should not include a command that produces an error, but I want you to include this error-generating command here.
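The contrast between the three notations can be seen on a small hypothetical pair of objects. The names demoMatrix and demoFrame are made up, and try() is used here only so that the sketch runs to completion despite the error; it is not something this problem set asks you to use.

```r
# Hypothetical labeled matrix and the equivalent data frame
demoMatrix <- matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE,
    dimnames = list(c("A", "B"), c("x", "y")))
demoFrame <- as.data.frame(demoMatrix)

# Row-column notation by number and by name works on both
demoFrame[2, 1]
demoFrame["B", "x"]
demoMatrix[2, 1]
demoMatrix["B", "x"]

# Dollar-sign notation works on the data frame
demoFrame$x[2]

# On the matrix it raises an error; try() captures it so the script continues
result <- try(demoMatrix$x[2], silent = TRUE)
class(result)
```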
For any work in R, you want to group your code into logical chunks (like paragraphs) and include a few comments to help the reader see the structure of the code. For this problem set, begin with a comment showing the part (e.g., Part 1, Part 2, etc.). On the next lines, include each line of code for that part, with no blank lines in between. Follow that block of code with one blank line, then repeat this for the next section. Follow this structure on every problem set.
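A commands file laid out this way might look like the following sketch. The code inside each part is a hypothetical placeholder, not the answers to this problem set; only the layout (a comment naming the part, code lines with no blank lines between them, then one blank line) is the point.

```r
# Part 1
x <- seq(0, 10, by = 2)
range(x)

# Part 2
y <- x^2
y[x == 4]
```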
Follow the instructions for how to format and submit your problem set. E-mail your commands file and the two correctly named .csv files you created to stratum@uga.edu. The subject of your email should be 8370 problem set 1. This problem set is due at 2:00 PM, Tuesday, 26 August. As always, it is advisable to submit problem sets early: your work will be of better quality, you will be less stressed, and you will ensure that last-minute issues don’t trigger a late-work penalty.