6  Dev-Ops for Data Scientists

Appendix

7 Terminal

Using the terminal in RStudio

8 Books

DevOps for Data Science by Alex Gold

Software Engineering for Data Scientists by Catherine Nelson

The Software Engineer’s Guidebook by Gergely Orosz

Refactoring by Martin Fowler

Designing Data-Intensive Applications by Martin Kleppmann

The DevOps Handbook by Gene Kim, Jez Humble, Patrick Debois, and John Willis

The Phoenix Project by Gene Kim, Kevin Behr, and George Spafford

9 Linux

9.1 Why Linux?

  • GNU/Linux (the Linux kernel combined with GNU tools) is one of the world’s most popular operating systems
    • It’s even used on Mars: the Ingenuity helicopter that flew alongside the Perseverance rover runs Linux!
  • It’s open source and freely available
  • Different distributions available
  • Potentially less exploitable and more secure

9.2 Terminology

  • The command line is a way to interact with a computer by typing text-based instructions to it, and reading the responses.
  • The software—called an interpreter—that translates your input into commands and executes them for you is the shell.
  • A terminal is an input and output environment that presents a text-only window running a shell.
  • A shell is a program that exposes the computer’s operating system to a user or program. In Linux systems, the shell presented in a terminal is a command line interpreter.
  • A command line interface is a user interface (managed by a command line interpreter program) which processes commands to a computer program and outputs the results.
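  • On most Linux distributions, the default shell is Bash. Zsh is the default on macOS and a few distributions, and small images such as Alpine ship a lighter shell (ash, from BusyBox).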
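To see these terms in practice, the short sketch below asks the running system which shell is configured for your account and which login shells are installed. It assumes a Unix-like system. The variable name login_shell is only illustrative, and /etc/shells can be absent in very small container images, so the script checks for it first.

```shell
# SHELL is normally set at login to the user's configured shell;
# fall back to a placeholder label if it is unset
login_shell="${SHELL:-unknown}"
echo "Login shell: $login_shell"

# /etc/shells lists the shells that are valid login shells on this system
if [ -r /etc/shells ]; then
    cat /etc/shells
else
    echo "/etc/shells not found on this system"
fi
```

The terminal is the window you typed this into; the shell is the program that read and ran each line.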

9.3 Most important Linux commands

Command | Description | Examples
cd | change directory | https://linuxize.com/post/linux-cd-command/
ls | list files in the current working directory; add -a to include hidden files (-lha gives a long listing with human-readable sizes) | https://linuxize.com/post/how-to-list-files-in-linux-using-the-ls-command/
pwd | print working directory | https://linuxize.com/post/current-working-directory/
touch | create a file | https://linuxize.com/post/linux-touch-command/
mkdir | create a directory | https://linuxize.com/post/how-to-create-directories-in-linux-with-the-mkdir-command/
vim | open a file in the Vim text editor | https://linuxize.com/post/how-to-save-file-in-vim-quit-editor/
curl | command-line utility for transferring data from or to a server, using one of the supported protocols, including HTTP, HTTPS, SCP, SFTP, and FTP | https://linuxize.com/post/curl-command-examples/
echo | print arguments to standard output | https://linuxize.com/post/echo-command-in-linux-with-examples/
$PATH | environment variable that holds a colon-delimited list of directories where the shell searches for executable files | https://linuxize.com/post/how-to-add-directory-to-path-in-linux/
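As a quick illustration of how several of these commands fit together, here is a minimal sketch that works inside a throwaway directory. The directory layout and the script name hello are hypothetical. chmod is not in the list above and is used here only to make the script executable.

```shell
# Work in a throwaway directory so none of your real files are touched
scratch=$(mktemp -d)
cd "$scratch"
pwd                          # print the working directory

mkdir -p project/bin         # create a directory (and its parent)
touch project/notes.txt      # create an empty file
ls -lha project              # long listing, hidden entries included

# Create a tiny executable script (hypothetical name: hello)
printf '#!/bin/sh\necho "hello from PATH"\n' > project/bin/hello
chmod +x project/bin/hello

# Prepend the directory to PATH so the shell can find the script by name
PATH="$scratch/project/bin:$PATH"
export PATH
echo "$PATH" | tr ':' '\n' | head -n 3   # first few PATH entries, one per line

greeting=$(hello)            # found through the PATH lookup, no path typed
echo "$greeting"
```

Because the new directory is placed at the front of PATH, the shell finds this hello before any other program with the same name. The change lasts only for the current shell session.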

9.4 Linux Exercises

9.4.1 Setup

Log in to Play with Docker using your Docker Hub credentials.

Add an instance to your session.

9.4.2 Exercises

  1. Identify the user that you are logged in as.

    whoami
  2. Identify the Linux distribution of your instance.

    cat /etc/*-release
  3. Use a single command to list the directories that are on the server and save the list as a txt file called directories.txt. Use cat to inspect the text file.

    ls -la > directories.txt
    cat directories.txt
  4. Explore the directories and files using cd, pwd, and ls -la. What is missing in this Linux server based on this article?

  5. Alpine images tend to be quite small and lack many of the files and directories that we need. Let’s pull in a more recent Ubuntu Linux image. Go to Docker Hub and find the command to pull the latest version of Ubuntu Linux. Make sure to use the Docker Official Image. (We will go into much more depth later on how Docker images and containers work.)

    # https://hub.docker.com/_/ubuntu/tags
    docker pull ubuntu:latest
    
    # use this command to see what images have been pulled
    docker image list 
  6. Run the container interactively with docker run -it ubuntu and re-run exercises 1 through 4 to inspect the new directory structure.

  7. Change to your home directory (/root when you are logged in as root) and then create two folders titled test1 and test2.

    cd ~
    mkdir test1 test2
  8. Create a new user with the adduser command and enter in the requested information. Identify which groups currently exist in the system. Add your new user to the root group and then switch to that new user.

    adduser rika
    
    # Adding user `rika' ...
    # Adding new group `rika' (1000) ...
    # Adding new user `rika' (1000) with group `rika' ...
    # Creating home directory `/home/rika' ...
    # Copying files from `/etc/skel' ...
    # New password: 
    # Retype new password: 
    # passwd: password updated successfully
    # Changing the user information for rika
    # Enter the new value, or press ENTER for the default
    #     Full Name []: Rika
    #     Room Number []:
    #     Work Phone []:
    #     Home Phone []:
    #     Other []:
    # Is the information correct? [Y/n] Y
    
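    # list every group defined on the system (plain `groups` only shows the current user's groups)
    cat /etc/group
    usermod -aG root rika
    su rika
    # use ctrl + d to exit back to root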
    
  9. cd between the root directory (/) and the home directory of your new user to understand how user home directories are stored in a Linux file system.

    # the tilde is a shortcut to the home directory of the signed in user
    cd ~ 
    pwd
    cd /
    pwd
  10. Let’s do some updates for our server.

    # run these as root (press ctrl + d first if you are still logged in as rika)
    # refresh the package index
    apt-get update
    # install sudo, which temporarily elevates privileges so users can complete sensitive tasks without logging in as root
    apt-get install -y sudo
    # add your user to the sudo group
    usermod -aG sudo rika
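After exercises 8 and 10 it helps to confirm that the group changes took effect. The sketch below shows how group membership is recorded in a file with the /etc/group layout. The helper name group_members and the sample data are hypothetical and only stand in for what the file might contain after the exercises. The real GIDs on your container may differ.

```shell
# group_members FILE GROUP: print the supplementary members of GROUP, one per line.
# Each line of the file has the form name:password:GID:member1,member2
group_members() {
    awk -F: -v g="$2" '$1 == g { n = split($4, m, ","); for (i = 1; i <= n; i++) print m[i] }' "$1"
}

# Sample data standing in for /etc/group after exercises 8 and 10 (illustrative values)
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:rika
sudo:x:27:rika
rika:x:1000:
EOF

root_members=$(group_members "$sample" root)
sudo_members=$(group_members "$sample" sudo)
echo "root: $root_members"
echo "sudo: $sudo_members"
rm -f "$sample"
```

On the container itself you could point the helper at the real file, for example group_members /etc/group sudo, or simply run id rika or groups rika. The rika group has an empty member list because a user's primary group is recorded in /etc/passwd rather than in the member field.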

9.5 Linux Refresher Resources

9.6 Tutorials

9.7 Survey results

TBD