RStudio - Session 2

Session 2

Session 1 was the home page.

In Session 2, we are going to show you how to save this file and how to open it again. We'll suggest you play with variables to get the hang of things. Then we will learn how to put that code into a function and register it with the system so it can be called again. That part was not very intuitive either.

Then we'll shadow a video on functions that gives good exposure to functions and output handling. The presenter sounds amazing, but in fact she leaves quite a bit to be desired when it comes to proper use of the console versus script files, and good programming syntax in general.

Save Script

There appear to be two icons you can use to save the script, in addition to File > Save and Ctrl+S.

IMPORTANT NOTE: When saving on a PC, you only enter a file name. Do not add an extension. The dialog does NOT show you what extension it will use, which is confusing: it says "Save as type: All Files". If you try to add an extension like .txt, it throws an error. When you save, it adds the extension ".R" for you.

Open Script (File)

Select File > Open, or use the icon that appears to keep track of the most recently opened files for quick access.

NOTE: The first time you open a file on a PC by browsing to its folder from inside R, the PC will NOT know what program to open it with. Odd and a little confusing. Just allow it to find the program to open with, and then the file will load.

Putting this script into a function so it can be called from another function or routine

As a guy with 30 years of programming experience, I didn't find this part intuitive. It took about 20-30 minutes of looking around and watching a video or two to finally see the simple trick to this. A little frustrating, but another learning curve behind you! Don't try to do this right away and don't try to memorize it. Read this top to bottom a few times until you generally get the gist of everything shared, THEN come back to the top and try to duplicate it in your own version of RStudio.

The syntax for a function is:

func_name = function(optional params divided by commas) { stuff }

This is very similar to a JavaScript function expression.

So, you can take your other code from Session 1 and just wrap it with the function syntax.
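As a minimal sketch of that wrapping step: the Session 1 code is not reproduced here, so the plotting call, the sample data, and the name func_demoPlot below are all hypothetical stand-ins.

```r
# Hypothetical stand-in for the Session 1 plotting code, wrapped in a function.
# The name func_demoPlot and the default data are assumptions for illustration.
func_demoPlot = function(x = 1:8, y = c(2, 4, 3, 7, 5, 8, 6, 9)) {
  # The body is just the code that used to sit at the top level of the script.
  plot(x, y, main = "Wrapped in a function")

  # Quietly hand back the number of points that were plotted.
  return(invisible(length(x)))
}
```

Nothing is drawn when this definition is run. The plot only appears once the function is called, for example with func_demoPlot().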

NOTE: Generally speaking, I name most functions starting with "func". Oftentimes I will omit this on a top-level function, and I will often reuse the top-level name in its sub-functions, along with numbers and a further description. That will make sense as you get into this more. Cognitively, this helps a lot when you have thousands of pages of code.

THE TRICK!! When you are done updating your code, you need to submit (or re-submit) it so that RStudio registers the function. RStudio calls this "sourcing" the script, and the button is labeled Source. Instructions below...

If you want a little musical torture to see where we had to go to grab this tidbit of info, see the link below.

The title of the video sounded perfect. Unfortunately, the experience was painful.

Nothing like listening to techno music while the presenter clicks frantically around the RStudio interface without really detailing the moves in a way that one can follow. BUT we got what we needed, so thanks to "magic decks" for that contribution.

https://www.youtube.com/watch?v=7jDVGs1afJk

When you hit "Source", it will in fact do a syntax check on the file.

If there are errors, it will show them in the console.
If not, IT WILL SAVE THE FILE, RUN IT, AND YOUR FUNCTION WILL THEN SHOW UP AS AVAILABLE IN THE GLOBAL ENVIRONMENT.
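The same thing can be done from the console with R's source() function. The sketch below writes a tiny script to a temporary file so that it is self-contained; the file contents and the name func_hello are made up for illustration, and in practice you would pass the path of your own saved .R file.

```r
# Write a tiny script to a temporary file (a stand-in for your saved .R file).
script_path = tempfile(fileext = ".R")
writeLines(
  "func_hello = function() { return('hello from a sourced file') }",
  script_path
)

# source() runs every line of the file; here that defines func_hello.
source(script_path)

# The function is now registered and can be called like any other.
print(exists("func_hello"))
print(func_hello())
```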

Understanding the Global Environment versus a local or more limited scope will likely become more relevant with time. Generally speaking, anything registered in the Global Environment can be called from anywhere. Just make sure you don't reuse a name that is already in the Global Environment: R does not block this. Assigning to an existing name silently replaces the earlier object, so only the latest definition survives. JavaScript does not block it either. The usual symptom, in either language, is that you are making changes to code, running it, and still seeing old results. That's one way to know that a stale or duplicate function, object, or variable is registered globally.

If you try to click Run, as you did when this code was NOT in a function wrapper, nothing appears to happen. Running a function definition only defines the function; it does not call it.

The console may just show a second caret, the > prompt (2). While we are here, there is a very subtle check-mark-looking icon related to the console (3). Use that to clear the console (Ctrl+L does the same thing). I do that fairly regularly. You don't have to clear it, but it helps when running code so you can more easily see where the messages start for a new process.
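A quick way to check how R treats a repeated name is to define the same function twice and see which one answers. The name func_greet is hypothetical.

```r
# Define a function, then define another one under the same name.
func_greet = function() { return("first version") }
func_greet = function() { return("second version") }

# Only one object called func_greet exists, and it is the latest definition.
print(func_greet())
print(sum(ls() == "func_greet"))
```

R raises no error or warning here, so a consistent naming scheme is the main protection against overwriting something by accident.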

At this point, you can test your function with a call from the console.

Type func_eightGraphs() and hit Enter...

and voilà!

NOTE: When calling a function you MUST have the () after the function name. If you just want to see the code of a function, leave off the () and, instead of running it, R will simply print the text that is in the function...
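A small sketch of the difference, using a hypothetical function named func_addOne:

```r
# A tiny function to experiment with (the name is made up for this example).
func_addOne = function(x) { return(x + 1) }

# With (): the function runs and its result is printed.
func_addOne(5)

# Without (): nothing runs; R prints the function's source text instead.
func_addOne
```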

Time to play around a little...

In either this version of your code or the original, start changing around some of the inputs. Change the x and y points. Modify the axis ranges if you can. Play with the different symbols for the points. Figure out how we figured out what type = "n" meant (google around until you can find a reference to that)...

NOTE: We found several things which were kind of disturbing when we were playing with it. When some things are NOT set correctly for X and Y in the data or in the assignments of those, instead of throwing errors, it seems to have been designed to "make assumptions" about what you might have intended or wanted. This can be extremely dangerous in a stats/analysis package. I'm not sure why they would have chosen such corrective behavior, but it sure seems like they did.
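As a starting point for that experimenting, here is a sketch with made-up data showing where the points, the axis ranges, and the point symbol are set. The values are arbitrary; xlim, ylim, pch, and col are standard arguments to plot(), and running help("plot") in the console lists the options for type.

```r
# Made-up data points to experiment with.
x = c(1, 2, 3, 4, 5)
y = c(2, 5, 3, 8, 6)

# xlim and ylim set the axis ranges; pch picks the point symbol
# (17 is a filled triangle); col sets the color.
plot(x, y,
     xlim = c(0, 6), ylim = c(0, 10),
     pch = 17, col = "blue",
     main = "Playing with the inputs")
```

Try changing one argument at a time and re-running, so it stays clear which change caused which effect.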
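One R behavior that may be part of what is described above (this is an assumption, since the exact cases are not listed) is "recycling": when two vectors of different lengths are combined, R repeats the shorter one instead of stopping with an error.

```r
# Two vectors of different lengths.
a = c(1, 2, 3, 4)
b = c(10, 20)

# R reuses b as 10, 20, 10, 20 to match the length of a.
# Because 4 is a multiple of 2, there is not even a warning.
print(a + b)
```

R only warns when the longer length is not a multiple of the shorter one, so it is worth checking vector lengths yourself before trusting a result.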

Here's another video...

Here's another video we stumbled on..

I wasn't so keen on her approach to explaining things. There were some real no-no's in her function naming nomenclature. Also, only a programmer who tries to keep up with her fast talk would realize all the things she left out that anyone trying to duplicate her work would need. BUT, A LOT of people were really happy with her tutorial. At the time of this publishing she had 562 thumbs up and only 24 down!

If you watch long enough, you will see her talk about returning multiple values from a function, so the video is good general education for that for sure. And then she talks about returning a data.frame, and that is GREAT!

https://www.youtube.com/watch?v=i2VH5jIL76Y

Here's a list of the things I found confusing or that she did NOT explain with enough clarity. Her spoken presentation was fast and fluid, and a smart person might have walked away feeling dumb, which is never the goal. She left out a TON that would leave most in the dark...

1) Function naming -- NEVER name a function with a period somewhere in it. That could be totally confusing to any experienced programmer on first read. It would look like an object with a function member and leave them quite confused when they couldn't find the object (she names a function circle.area). In R the period is legal in a name, but it is also how S3 methods such as print.data.frame are named, which adds to the ambiguity. An alternative choice would be circle_area. Also, my suggestion, generally speaking, is to name functions starting with "func" or something comparable, to get into the habit for larger work.

2) Storing of functions in memory happens automatically from the console -- for me, it was NOT intuitive that if I type a function into the console and hit Enter, it gets saved in the Global Environment automatically. She alludes to that but doesn't make it clear at all. Furthermore, once it is in the Global Environment, it was not obvious how to get it out. The answer: in the Environment tab, switch to Grid view, select the item you want to delete, and use the 'broom' icon... https://support.rstudio.com/hc/en-us/community/posts/204888358-Delete-a-function-or-data-object-from-the-workspace

3) Trying to use the console to create functions is an absolute nightmare -- when creating a function in the console, if you accidentally hit Enter, it submits and there is no way to edit it. If you make an error, there is no way to edit; you can only delete it from memory and redo it. To get to the second and third lines, you need to remove the closing bracket from the function, etc. It's a keystroke nightmare. Create the functions in a script file like you learned earlier, and submit them with the Source button.

4) Naming of variables -- She uses x for an array of values (also referred to as a vector of values by this group). In her two-output function, she has variables named area and circumference. Then she does a return like >> return(c(Area = Area, Circumference = Circumference)). Variables should NEVER be named the same as the thing they represent without some type of denotation. It confuses everything, especially for those newer to programming. areaX and circumferenceX would have been far better variable names. Thus this would have read return(c(Area = areaX, Circumference = circumferenceX)). This type of naming confusion is very common when folks who have not done a lot of programming start teaching before they've built larger systems. Likewise, when folks who are good at learning start regurgitating without thinking, this can happen too.
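Objects can also be removed from the console with R's built-in rm() function. A minimal sketch, using a throwaway function named func_temp (a made-up name):

```r
# Create a throwaway function so there is something to remove.
func_temp = function() { return(42) }
print(exists("func_temp"))

# rm() deletes the named object from the environment.
rm(func_temp)
print(exists("func_temp"))

# Not run here: rm(list = ls()) would remove everything at once.
```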

She wrote...(at 2:58)

My Suggestion...
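The code from the video is not reproduced here, so the following is only a sketch of how the suggested naming might look in a complete function. The function name func_circleStats, the parameter name radius, and the body are assumptions for illustration.

```r
# Hypothetical two-output function using the suggested variable names.
func_circleStats = function(radius) {
  # The X suffix marks these as variables, distinct from the output labels.
  areaX = pi * radius^2
  circumferenceX = 2 * pi * radius

  # Labels on the left, variables on the right: no ambiguity about which is which.
  return(c(Area = areaX, Circumference = circumferenceX))
}

# Call it and pick out one of the named values.
resultX = func_circleStats(2)
print(resultX)
print(resultX["Area"])
```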

At 3:56, she mentions that return is NOT required, because the value of the last expression evaluated gets returned. She rightly indicates that you should always use a return call for clarity.

Below is an example where an array of values (a vector, as she calls it) is fed into a function. Generally speaking, you can think of what's happening behind the scenes as a for-each loop: R's arithmetic is vectorized, so the calculation is applied to every element. I personally prefer to write my own for-each loops, but this is a shortcut available to you with this system.
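To see both behaviors side by side, here is a small sketch with two hypothetical functions that differ only in whether return is written out:

```r
# Implicit: the value of the last expression is what comes back.
func_squareImplicit = function(x) {
  x^2
}

# Explicit: the same result, but the intent is spelled out.
func_squareExplicit = function(x) {
  return(x^2)
}

# Both calls give the same answer.
print(func_squareImplicit(3))
print(func_squareExplicit(3))
```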
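The two approaches can be compared directly. This is a sketch with a made-up function, func_circleArea, and made-up radii; it is not the code from the video.

```r
# Hypothetical function: area of a circle for a given radius.
func_circleArea = function(radius) {
  return(pi * radius^2)
}

radii = c(1, 2, 3)

# Shortcut: pass the whole vector and get one result per element.
areasVectorized = func_circleArea(radii)

# The same thing written out as an explicit loop.
areasLoop = numeric(length(radii))
for (i in seq_along(radii)) {
  areasLoop[i] = func_circleArea(radii[i])
}

# Both approaches should agree.
print(all.equal(areasVectorized, areasLoop))
```

The explicit loop is longer but makes each step visible, which is the trade-off described above.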

She wrote... (at 4:50) - Submit array/vector and get multiple outputs