ruby

Ruby doc: https://ruby-doc.org/

run ruby code

Install Ruby: it is best to install it with rbenv, which lets you choose a specific Ruby version for each application or environment:

  • rbenv versions : lists all Ruby versions known to rbenv, and shows an asterisk next to the currently active version
  • rbenv local : gives you the Ruby version used in the current application
  • rbenv local 2.6.6 : set Ruby version 2.6.6 to be used in the current application
  • rbenv global : gives you the Ruby version used globally in your computer
  • rbenv global 2.6.6 : set Ruby version 2.6.6 to be used globally in your computer

In a terminal: IRB (stands for "Interactive Ruby") is installed by default with Ruby. Just launch your terminal and type irb, and you can type and execute Ruby code without creating a file. IRB is a "Read-Eval-Print Loop" (REPL) tool: it gives you access to all of Ruby's built-in features and all the gems you've installed. You can also require other gems or modules. Run exit to quit IRB.

With a Ruby file: create a Ruby file (ex: file_name.rb) and, in the terminal, run ruby file_name.rb

data types

String: defined between single quotes: 'my string' or double quotes: "my other string". Double quotes are mandatory for string interpolation: "My name is #{first_name}"
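
A small sketch of the difference between the two quote styles. The value given to first_name is just a sample chosen for illustration:

```ruby
first_name = 'Tintin' # sample value, chosen for illustration

double_quoted = "My name is #{first_name}" # interpolation happens
single_quoted = 'My name is #{first_name}' # the text is kept literally

puts double_quoted # My name is Tintin
puts single_quoted # My name is #{first_name}
```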

Integer: 23

Float: 3.14

Array: a list of elements: ['a', 'b', 'c']

Boolean: true or false. In Ruby everything is "truthy" except false and nil

Range: an interval between two values:

  • (1..10) is a range between 1 and 10 (10 is included)
  • (1...10) is a range between 1 and 9 (10 is excluded)
  • You can do letter ranges as well: ('a'..'e') is a, b, c, d, e (or symbol ranges as well)
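
To see the difference between the two-dot and three-dot forms, here is a quick sketch that converts ranges to arrays (variable names are only illustrative):

```ruby
inclusive = (1..10).to_a   # two dots: the end value is included
exclusive = (1...10).to_a  # three dots: the end value is excluded
letters   = ('a'..'e').to_a

puts inclusive.last # 10
puts exclusive.last # 9
p letters           # ["a", "b", "c", "d", "e"]

# include? checks membership without building an array
ten_in_inclusive = (1..10).include?(10)  # true
ten_in_exclusive = (1...10).include?(10) # false
```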

Symbols: defined using a : followed by the name of the symbol: :my_symbol. A symbol is a "cousin" of the string, mostly used for hash keys. Symbols are like "named strings" used to represent something (whereas a string is generally data in itself). You can consider it as "the thing named" (:id is "the thing named id")

Simplified example: you would use a string to write a country: 'Panama', but you would use a symbol to represent the concept of "country": :country.

Hash: similar to objects in JS. Used to store key-value pairs:

  • grades = { 'Pierre' => 10, 'Paul' => 14, 'Jack' => 9 }
  • We usually use symbols for hash keys: grades = { :pierre => 10, :paul => 14, :jack => 9 }, which can also be written: grades = { pierre: 10, paul: 14, jack: 9 }

Nil: nil means "nothing". Equivalent of null in JS.

In Ruby everything is an object: meaning that all those data types are actually objects, which are instances of classes. For example "my string" is an instance of the class String:

  • You can get the class of an element with the method class. For example 23.class returns Integer. And Integer.class returns Class
  • You can get the parent of a class with the method superclass. For example "renodor".class.superclass returns Object
  • Boolean is actually not a class, true and false both have their own class, respectively TrueClass and FalseClass
  • nil class is NilClass
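
These relationships are easy to check in IRB. A minimal sketch (ancestors is a standard method that lists the whole inheritance chain of a class):

```ruby
p 23.class                   # Integer
p Integer.class              # Class
p 'renodor'.class.superclass # Object
p true.class                 # TrueClass
p false.class                # FalseClass
p nil.class                  # NilClass

# ancestors lists the whole inheritance chain; BasicObject always comes last
p String.ancestors.last      # BasicObject
```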

Ultimately, every object is an instance of a class that inherits from Object. Classes are objects too: every class is an instance of the class Class, which itself inherits from Object (through Module). (And at the very top, the parent class of all classes in Ruby is the class BasicObject)

Ruby is a strongly typed language: it will not silently convert between unrelated data types (i.e. objects that are instances of different classes), so for example adding an Integer to a String raises a TypeError. You therefore often have to convert one data type into another, and Ruby has built-in methods for that:

  • to_s : converts an object into a string: 123.to_s returns "123"
  • to_i : converts an object into an integer: "123".to_i returns 123
  • to_f : converts an object into a float: 23.to_f returns 23.0
  • to_a : converts an object into an array: (1..5).to_a returns [1, 2, 3, 4, 5]
  • to_sym : converts an object into a symbol: 'renodor'.to_sym returns :renodor
  • to_h : converts an object into a hash: [['name', 'Gaston'], ['username', 'LaGaffe']].to_h returns {"name"=>"Gaston", "username"=>"LaGaffe"}
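
A short sketch of why these conversions matter: mixing a String and an Integer raises an error until you convert explicitly. The variable names are only illustrative:

```ruby
age = 23

begin
  'Age: ' + age # String + Integer is not allowed
rescue TypeError => e
  puts "TypeError: #{e.message}"
end

message = 'Age: ' + age.to_s # explicit conversion works
total   = '123'.to_i + 1

puts message # Age: 23
puts total   # 124

# to_i never raises: it reads the leading digits and falls back to 0
puts '12abc'.to_i # 12
puts 'abc'.to_i   # 0
```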

variables

In Ruby to assign a value to a variable you just use the equal sign: name = 'Tintin' >> we assigned the value 'Tintin' to the variable name.

Naming convention: in Ruby variables are snake_case: my_variable

Here we actually defined a local variable (accessible in the local scope). The type of a variable is determined by the first character(s) of its name:

  • Local variables start with a lowercase letter or an underscore: my_local_variable
  • Global variables start with $ : $my_global_variable
  • Instance variables start with @ : @my_instance_variable
  • Class variables start with @@ : @@my_class_variable

Constant: a Ruby constant is like a variable, except that its value is supposed to remain constant for the duration of the program (Ruby does not actually enforce it: reassigning a constant only triggers a warning). Constants must start with a capital letter, and by convention most constants are written in all uppercase: MY_CONSTANT. If you want the object held by your constant to really stay unchanged, you can use the method freeze, which prevents the object from being modified:

# If we freeze an Array:
NUMBERS = [1, 2, 3, 4].freeze

# And then try to add a new value to this Array
NUMBERS << 5

# This raises a FrozenError: can't modify frozen Array

Abbreviated assignments:

  • a += 1 is the same as a = a + 1
  • a -= 1 is the same as a = a - 1
  • a *= 1 is the same as a = a * 1
  • a /= 1 is the same as a = a / 1

Multiple assignments:

  • You can assign multiple variables at the same time: a, b = 1, 2 >> 1 will be assigned to a and 2 will be assigned to b (but it is not recommended)
  • You can assign the same value to multiple variables at the same time by chaining assignments: a = b = 1 >> 1 will be assigned to a and b. (Note that a, b = 1 does not do this: it assigns 1 to a and nil to b)
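
A quick sketch showing what each form of multiple assignment actually assigns (the variable names are arbitrary):

```ruby
a, b = 1, 2   # parallel assignment
c, d = 1      # only c gets a value, d is nil
e = f = 1     # chained assignment: both are 1
a, b = b, a   # handy use of parallel assignment: swap two variables

p [a, b] # [2, 1]
p [c, d] # [1, nil]
p [e, f] # [1, 1]
```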

methods

Ruby simple method definition and call with a parameter:

# method definition:
def add_five(number)
  number + 5
end

# calling the method:
add_five(5)
# >> will return 10

Return:

  • By default in Ruby a method returns its last line (its last expression)
  • Additionally, the return keyword can be used to make explicit which value will be returned
  • But the return keyword is also used to stop the execution of a method. Indeed a Ruby method stops and returns as soon as it reaches a return
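
Both behaviours can be seen in one small method. This is only an illustrative sketch; describe is a made-up method name:

```ruby
def describe(number)
  return 'zero' if number.zero? # explicit return: the method stops here

  # no return keyword needed below: the value of the last
  # evaluated expression is what the method returns
  if number.positive?
    'positive'
  else
    'negative'
  end
end

puts describe(0)  # zero
puts describe(5)  # positive
puts describe(-3) # negative
```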

Naming convention:

  • In Ruby methods are snake_case: my_method
  • By convention, methods ending with a ? return a boolean (true or false). Ex: 2.even? will return true because 2 is an even number
  • By convention, methods ending with a ! are considered "dangerous", usually meaning that they modify the object they are called on in place. Ex:
username = 'renodor'
username.capitalize # returns 'Renodor', but username's value is still 'renodor':
puts username # outputs 'renodor'

# whereas:
username.capitalize! # returns 'Renodor', and username's value is now 'Renodor':
puts username # outputs 'Renodor'

Parameters:

Parentheses around parameters are optional (you see it a lot in Rails):

def add_five number
  number + 5
end

Multiple arguments are separated by a comma:

def add_values(a, b)
  a + b
end

Parameters may have default values, but parameters with default values must be grouped together:

def add_values(a = 1, b = 2, c)
  a + b + c
end
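
To make the rule concrete, here is how Ruby fills the parameters of the method above when it is called with one, two or three arguments: required parameters are satisfied first, then the remaining arguments fill the optional ones from left to right.

```ruby
def add_values(a = 1, b = 2, c)
  a + b + c
end

puts add_values(10)       # 13 (a = 1, b = 2, c = 10)
puts add_values(5, 10)    # 17 (a = 5, b = 2, c = 10)
puts add_values(5, 6, 10) # 21 (a = 5, b = 6, c = 10)
```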

conditional statements

In Ruby everything is truthy except false and nil
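
This rule surprises people coming from other languages, because 0 and empty strings or arrays are truthy in Ruby. A small sketch, using a hypothetical helper named truthy?:

```ruby
# Hypothetical helper: converts any value to true or false
# using the same rule as an if statement
def truthy?(value)
  value ? true : false
end

p truthy?(0)     # true
p truthy?('')    # true
p truthy?([])    # true
p truthy?(nil)   # false
p truthy?(false) # false
```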

If statements: execute the code if the condition is true

if condition
  # code to execute
end

if condition
  # code to execute
else
  # code to execute
end

if condition
  # code to execute
elsif other_condition
  # code to execute
else
  # code to execute
end

# Simpler form for one line statements:
code_to_execute if condition

Unless statements: execute the code if the condition is false. Normally (in other programming languages), you would use the reverse of a condition to achieve that: if !(condition). But in Ruby you can use the unless keyword:

unless condition
  # code to execute
end

# Simpler form for one line statements:
code_to_execute unless condition
# (This is the same as: code_to_execute if !condition)

Case statement: you define an expression to compare, and then you treat different case scenarios. Ex:

age = 50
case age
when 0
  p "you're not born yet little man"
when 10
  p "time to have fun"
when 18
  p "time to have more fun"
when 30
  p "time to have fun again"
when 50
  p "yep, still fun"
when 100
  p "probably a lot of fun too"
else
  p "it's not worth it"
end
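
A when clause is not limited to exact values: it also accepts ranges and several comma-separated values, and the whole case expression returns a value. A sketch with a made-up life_stage method and arbitrary age brackets:

```ruby
def life_stage(age)
  case age
  when 0..17   then 'minor'       # a range
  when 18, 19  then 'young adult' # several values
  when 20..64  then 'adult'
  else 'senior'
  end
end

puts life_stage(10) # minor
puts life_stage(19) # young adult
puts life_stage(50) # adult
puts life_stage(80) # senior
```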

Ternary: a shorter way to write simple conditionals:

condition ? 'I will be executed if its true' : 'I will be executed if its false'

&& means "and":

if condition && other_condition
  # will be executed if BOTH conditions are true
end

|| means "or":

if condition || other_condition
  # will be executed if AT LEAST one of the two conditions is true
end

So basically true && false is false whereas true || false is true:

true  && true   #=> true
false && false  #=> false
true  && false  #=> false

true  || true   #=> true
false || false  #=> false
true  || false  #=> true

loops

Loop with a fixed number of iterations:

10.times do
  # code
end
# will execute the code 10 times

# the block can also take a parameter, which receives the current iteration number (starting at 0):
10.times do |n|
  puts "#{n} - I am repeating myself"
end
# will output:
# 0 - I am repeating myself
# 1 - I am repeating myself
# 2 - I am repeating myself
# etc... (up to 9)

While loop:

while condition
  # code
end
# will stop the loop only when the condition is FALSE

Until loop:

until condition
  # code
end
# will stop the loop only when the condition is TRUE

For loop:

for num in [1, 2, 3]
  puts num
end
# will output:
# 1
# 2
# 3

The for loop is rarely used in Ruby; we much more often use the each method to iterate over an array:

[1, 2, 3].each do |num|
  puts num
end

Exit a loop early: you can exit a Ruby loop early in three different ways:

  • next keyword: will skip the current iteration and go directly to the next one
  • break keyword: will break out of the loop completely
  • return keyword: as seen in the method section, the return keyword stops the execution of a method. So it can be used to exit a loop early and return a specific value
Ex:
def my_loop
  (1..100).to_a.shuffle.each do |num| # will iterate over numbers from 1 to 100 in a random order
    next if num.odd? # will skip iteration if current number is odd
    break if num > 50 # will break from the loop if current number is higher than 50
    return num if num == 98 # will stop the loop and exit the method, returning the current number if it's 98
  end
end
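
Because the example above shuffles its numbers, its result changes on every run. Here is a deterministic sketch of the same three keywords, with made-up method names and inputs:

```ruby
def collect_small_evens(numbers)
  evens = []
  numbers.each do |num|
    next if num.odd?  # skip this iteration, continue with the next number
    break if num > 50 # leave the loop; the method continues after it
    evens << num
  end
  evens
end

def first_multiple_of_seven(numbers)
  numbers.each do |num|
    return num if (num % 7).zero? # leaves the loop AND the method
  end
  nil # reached only when nothing matched
end

p collect_small_evens([2, 3, 4, 60, 6])  # [2, 4]
p first_multiple_of_seven([3, 14, 21])   # 14
p first_multiple_of_seven([1, 2, 3])     # nil
```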

arrays

Arrays are ordered, indexed collections of objects. In Ruby arrays can contain different types of objects: a single array can contain integers, strings, other arrays, objects etc...

Create arrays:

  • Calling the literal constructor: []. Ex: arr = [1, 'hello', 2.5]
  • Calling new on the Array class: Array.new. If you add one argument it will define the initial size of the array: Array.new(3), will create: [nil, nil, nil]. If you add a second argument it will define the default value of the array elements: Array.new(3, true) will create [true, true, true]
  • (So to create an empty array and assign it to a variable, just do: arr = [] or arr = Array.new)
  • You can also pass a block to the Array.new method to create more complex arrays. Ex: Array.new(4) { |i| i.to_s } will create: ['0', '1', '2', '3']
  • Ruby has a shortcut to create an array of strings: %w(pierre paul jacques) is the same as ['pierre', 'paul', 'jacques']. (If you need string interpolation, use %W (capital W), which treats each element like a double-quoted string.)
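
The different ways of creating arrays side by side, as a quick sketch (variable names are only illustrative):

```ruby
empty   = []
sized   = Array.new(3)                # [nil, nil, nil]
filled  = Array.new(3, true)          # [true, true, true]
strings = Array.new(4) { |i| i.to_s } # ["0", "1", "2", "3"]

words = %w[pierre paul jacques]       # ["pierre", "paul", "jacques"]

name = 'jacques'
interpolated = %W[pierre paul #{name}] # %W allows interpolation

p sized
p filled
p strings
p words
p interpolated # ["pierre", "paul", "jacques"]
```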

Read arrays (thanks to index):

  • Ruby arrays start at index 0
  • Negative indexes count from the end: the last element has index -1, the second to last has index -2, the one before it has index -3 etc...
  • .first method will return the first array element
  • .last method will return the last array element
  • arr[n] will return the element at index n of array arr. Ex: if arr = [1, 2, 3], arr[1] will return 2
  • If you call arr[n] with an index that doesn't exist, it returns nil. Ex: arr[10] returns nil

Update arrays:

  • arr << 'yeah' will add 'yeah' at the end of array arr
  • .push(x) method does exactly the same: arr.push('yeah') will add 'yeah' at the end of array arr. You can push several elements at the same time: .push(x, y, z)
  • .unshift(x) method will add element x at the beginning of the array (pushing other elements one index further). You can unshift several elements at the same time: .unshift(x, y, z)
  • .insert(n, 'yeah') will insert 'yeah' at index n (pushing other elements one index further). Ex: if arr = [1, 2, 3] and you do arr.insert(2, 'yeah') now arr is [1, 2, 'yeah', 3]
  • arr[n] = 'yeah' will replace the element at index n with 'yeah'. Ex: if arr = [1, 2, 3] and you do arr[2] = 'yeah', now arr is [1, 2, 'yeah']
  • You can use the two previous operations with indexes that don't exist yet. Ruby will create nil elements to fill the gaps. Ex: if arr = [1, 2, 3] and you do arr.insert(10, 'hello'), now arr is [1, 2, 3, nil, nil, nil, nil, nil, nil, nil, "hello"]
  • You can also insert several elements at the same time: Ex: if arr = [1, 2, 3] and you do arr.insert(1, 'a', 'b', 'c') now arr is [1, "a", "b", "c", 2, 3]
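
A step-by-step sketch of these reading and updating operations on a single array, with the state of the array after each call shown in the comments:

```ruby
arr = [1, 2, 3]

arr << 4              # [1, 2, 3, 4]
arr.push(5, 6)        # [1, 2, 3, 4, 5, 6]
arr.unshift(0)        # [0, 1, 2, 3, 4, 5, 6]
arr.insert(2, 'yeah') # [0, 1, "yeah", 2, 3, 4, 5, 6]
arr[0] = 'first'      # ["first", 1, "yeah", 2, 3, 4, 5, 6]

p arr.first # "first"
p arr[-1]   # 6
p arr[100]  # nil (no element at this index)
```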

Delete array elements:

  • arr.delete(x) will delete x from array. If there are several elements corresponding it will delete them all.
  • arr.delete_at(n) will delete element at index n.
  • arr.clear will delete all elements of the array
  • arr.compact will return a copy of the array with all nil elements removed
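
Note that delete, delete_at and clear change the array itself, whereas compact returns a new array. A small sketch (values chosen for illustration):

```ruby
arr = [1, nil, 2, 2, 3]

arr.delete(2)           # removes every 2: arr is now [1, nil, 3]
arr.delete_at(0)        # removes the element at index 0: arr is now [nil, 3]

compacted = arr.compact # new array without nil: [3]
p compacted             # [3]
p arr                   # [nil, 3] (compact did not change arr)

arr.clear
p arr                   # []
```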

Arrays and strings

.join method will join all elements of an array and transform it into a string. You can give it a string as an argument that will serve as a separator between elements. Ex:

[1, 2, 3].join # returns "123"
[1, 2, 3].join(' ') # returns "1 2 3"
[1, 2, 3].join('-') # returns "1-2-3"

.split will split a string into an array. By default it splits on whitespace, but you can pass an argument to specify the separator. .split('') will split a word into its individual characters. Ex:

'I am a sentence'.split # returns ["I", "am", "a", "sentence"]
'abcd'.split('') # returns ["a", "b", "c", "d"]
'Hey, hi Mark!'.split(',') # returns ["Hey", " hi Mark!"]

Counting array elements:

  • arr.length and arr.size are equivalent methods that return the number of elements of arr
  • arr.count will also return the number of elements, but can take an argument or a block. If an argument is given, it counts the elements equal to that argument; if a block is given, it counts the elements for which the block returns true. Ex:
arr = [1, 2, 2, 3, 4]
arr.count # returns 5
arr.count(2) # returns 2
arr.count { |num| num > 3 } # returns 1

hashes

A hash is a collection of key/value pairs. Unlike arrays, hashes are not accessed by integer index: you retrieve the values thanks to their keys. (They are like JavaScript objects.) Since Ruby 1.9, hashes keep their key/value pairs in insertion order.

Create hashes:

  • Using its implicit form: my_hash = { 'country' => 'Panama', 'population' => 4170607 }
  • Calling the method new on the Hash class: Hash.new
  • By default if you try to access a key that does not exist, it will return nil. But you can pass an argument to the new method to override this default value. Ex: my_hash = Hash.new(0). Now if you call my_hash['anything'] ('anything' is a key that doesn't exist in the hash), it will return 0. You can also define this default value with the default= method: my_hash.default = 0
  • You can store any data type in a hash: strings, numbers, arrays, other hashes etc...
Ex:
# define an empty hash:
panama = {}

# add new key/value pairs to my hash:
panama['population'] = 4170607
panama['type'] = 'tropical'

# now my hash has two key/value pairs:
panama = {
  "population" => 4170607,
  "type" => "tropical"
}
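
A common use of a default value is counting occurrences, since every missing key starts at 0. A minimal sketch with a made-up list of words:

```ruby
words  = %w[ruby rails ruby gem ruby] # sample data, for illustration
counts = Hash.new(0)                  # missing keys return 0 instead of nil

words.each { |word| counts[word] += 1 }

p counts['ruby']        # 3
p counts['rails']       # 1
p counts['python']      # 0 (default value)
p counts.key?('python') # false (reading a missing key does not create it)
```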

Read hashes:

  • my_hash[key] will return the value of this key. Ex: my_hash['country'] will return 'Panama'
  • If you call a key that does not exist, it will return nil (or the default value you set when you created the hash)

Update hashes:

  • Keys of a hash must be unique. So to update the value of an existing key, you just reassign it: my_hash[key] = 'new value'
  • To add new values to a hash, just bind a value to a new key

Delete hash values

  • my_hash.delete(key) will delete this key (and so the value as well)
  • You can't really delete a value, but you can set it to nil: my_hash[key] = nil
  • my_hash.clear will delete all key/value pairs of a hash
  • my_hash.compact will return a copy of the hash without the key/value pairs for which the value is nil

Get hash keys and values:

  • my_hash.keys returns an array with all the keys of my_hash
  • my_hash.values returns an array with all the values of my_hash
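
The deleting and listing methods above can be tried on a small hash. A sketch with made-up data (keys and values are used for the printed results, since they return plain arrays):

```ruby
singer = { name: 'Bob', surname: nil, speciality: 'Reggae' }

cleaned = singer.compact              # new hash without the nil value
removed = singer.delete(:speciality)  # deletes the pair, returns its value

p cleaned.keys  # [:name, :speciality]
p removed       # "Reggae"
p singer.keys   # [:name, :surname]
p singer.values # ["Bob", nil]
```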

Hash and symbols: we usually use symbols for hash keys, which allows a shorter hash syntax:

my_hash = { 'name' => 'Bob', 'surname' => 'Marley', 'speciality' => 'Reggae' }

# If we use symbols for keys:
my_hash = { :name => 'Bob', :surname => 'Marley', :speciality => 'Reggae' }

# Which can be shortened like that:
my_hash = { name: 'Bob', surname: 'Marley', speciality: 'Reggae' }

iterators

Iterators are methods that allow you to iterate (loop) easily over objects that contain several elements. You can iterate over arrays, hashes, or ranges for example. In Ruby, classes whose objects can be iterated over generally include the Enumerable module.

Each: will iterate over each element of an enumerable and execute the block for it. It will NOT modify the original enumerable by itself, it just runs the code for each element.

[1, 2, 3].each do |num|
  puts num
end
# will output:
# 1
# 2
# 3

Each with index: is the same as each but with a second block parameter that holds the index of the current element:

['hello', 'bonjour', 'hola'].each_with_index do |word, i|
  puts "#{i}- #{word}"
end
# will output:
# 0- hello
# 1- bonjour
# 2- hola

If you loop over a hash, the block of the each method takes 2 parameters

  • The first one represents the hash key
  • The second one represents the hash value
my_hash = { name: 'Bob', surname: 'Marley', speciality: 'Reggae' }
my_hash.each do |key, value|
  puts "#{key}: #{value}"
end

# will output:
# name: Bob
# surname: Marley
# speciality: Reggae

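Hashes do have each_with_index (it comes from Enumerable, and yields each [key, value] pair together with its position), but since hashes are accessed by key rather than by index it is rarely useful. Hashes also have other methods like each_key or each_value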
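
A small sketch of each_key and each_value, collecting what each block receives into arrays (keys_seen and upcased_values are just illustrative names):

```ruby
my_hash = { name: 'Bob', surname: 'Marley', speciality: 'Reggae' }

keys_seen = []
my_hash.each_key { |key| keys_seen << key }

upcased_values = []
my_hash.each_value { |value| upcased_values << value.upcase }

p keys_seen      # [:name, :surname, :speciality]
p upcased_values # ["BOB", "MARLEY", "REGGAE"]
```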

Map: it will also apply the block to each element of an array, but it will create a new array with the results and return it. So if you assign the result of map to a variable, this variable will contain the new array. Ex:

new_arr = [1, 2, 3].map do |num|
  num * 10
end

p new_arr # will output [10, 20, 30]

(If you also need the index, you can chain .map.with_index, or use .each_with_index.map)
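
One way to get the index while mapping is to chain with_index onto map. A minimal sketch (with_index also accepts an optional starting offset):

```ruby
words = ['hello', 'bonjour', 'hola']

numbered = words.map.with_index { |word, i| "#{i}- #{word}" }
p numbered # ["0- hello", "1- bonjour", "2- hola"]

# with_index takes an optional offset, here to start counting at 1
from_one = words.map.with_index(1) { |word, i| "#{i}- #{word}" }
p from_one # ["1- hello", "2- bonjour", "3- hola"]
```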

Hashes also have a map method (through Enumerable), but it returns an array, not a hash: the block receives the key and the value of each pair

Select: will return a new array containing all elements for which the block returns true. Ex:

new_array = [0, 1, 2, 3, 4, 5, 6].select do |num|
  num.even?
end

p new_array # will output [0, 2, 4, 6]

select works the same way with hashes, but the block takes 2 parameters, representing respectively the hash key and value, and the result is a new hash

Reject: will return a new array containing all elements for which the block does NOT return true. Ex:

new_array = [0, 1, 2, 3, 4, 5, 6].reject do |num|
  num.even?
end

p new_array # will output [1, 3, 5]

reject works the same way with hashes, but the block takes 2 parameters, representing respectively the hash key and value, and the result is a new hash
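
A sketch of select and reject on a hash, reusing the grades example from the data types section (the pass mark of 10 is an arbitrary choice for illustration):

```ruby
grades = { pierre: 10, paul: 14, jack: 9 }

# _name is prefixed with an underscore because the block does not use it
passed = grades.select { |_name, grade| grade >= 10 }
failed = grades.reject { |_name, grade| grade >= 10 }

p passed.keys  # [:pierre, :paul]
p failed.keys  # [:jack]
p passed.class # Hash
```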

block

In Ruby, a block is a piece of code. It is a way of grouping several statements together; you can think of it as an anonymous method. There are two different ways to write a Ruby block:

One-line syntax:

{ |num| num * 10 }

# as in:
arr.each { |num| num * 10 }

Multi-line syntax:

do |num|
  num * 10
end

# as in:
arr.each do |num|
  num * 10
end

A block works like a method: it returns the value of the last expression executed

Yield: is a Ruby keyword that executes the block passed to the current method. It makes a bridge between a method and a block, so a caller can pass a whole piece of code (a block) to a method rather than just a simple argument. When you have a yield in a method definition, and you call this method with a block it will:

  1. call the method, and execute it until the yield keyword
  2. when the yield keyword is reached, execute the block passed in the method call
  3. then finish the method execution normally
Ex:
# method definition
def strange_greetings
  p 'Hello'
  yield
  p 'Aaaaaand its over.'
end

# method calling
strange_greetings do
  p 'I am strange,'
  p 'but its cool right?'
end

# This will output, in this order:
# "Hello"
# "I am strange,"
# "but its cool right?"
# "Aaaaaand its over."
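
yield can also pass values to the block and use what the block returns, and block_given? tells the method whether a block was supplied at all. A sketch with a made-up greet_all method:

```ruby
def greet_all(names)
  return 'no block given' unless block_given?

  # each name is handed to the block; the block's return value is collected
  names.map { |name| yield(name) }
end

greetings = greet_all(%w[Tintin Milou]) { |name| "Hello #{name}!" }

p greetings                 # ["Hello Tintin!", "Hello Milou!"]
p greet_all(%w[Haddock])    # "no block given"
```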

regex

Regular expressions are used to match a pattern against a string. They can be used, for example, to validate user data submitted by a form, or to extract specific data from a text etc...

Create regex: in Ruby we usually create regex using /.../. But you can also use %r{...} or call new on the Regexp class. Indeed in Ruby, regexes are instances of the Regexp class.

Rubular is the reference tool to create/test Ruby regex (+ has a cheatsheet)

Warning: the following explanations may be specific to Ruby. Ruby uses its own regex engine and other languages use other regex engines, so regex interpretation can differ

Quantifiers: are applied to the previous character (or group), and specify how many times we expect it:

  • /x?/ means "0 or 1 x"
  • /x*/ means "0 or more x"
  • /x+/ means "1 or more x"
  • /x{3}/ means "3 x"
  • /x{3,}/ means "3 or more x"
  • /x{3,6}/ means "between 3 and 6 x"

Group: /(abc)/ will group those 3 characters together

Or: /a|b/ means a or b

Intervals:

  • /./ means "any single character" (but not a newline)
  • /[aB9]/ means "a or B or 9" (whereas /ab/ just means "a followed by b")
  • /[0-9]/ means "any digit between 0 and 9"
  • /[a-r]/ means "any letter between a and r"
  • /[a-zA-Z]/ means "any letter between a and z or A and Z"
  • /[^abc]/ means "any character EXCEPT a, b or c" (the ^ needs to be inside the [], otherwise it means something else)

Special characters: to match a character that already has a specific meaning in regex (ex: +), you need to put a \ before it. Ex: /1\+1/

Shortcuts:

  • /\d/ means "any digit" ([0-9])
  • /\D/ means "any non-digit" ([^0-9])
  • /\w/ means "any word character" (letter, number or underscore: [a-zA-Z0-9_])
  • /\W/ means "any non-word character" ([^a-zA-Z0-9_])
  • /\s/ means "any whitespace character" (spaces, tab, line break etc...)
  • Ruby also supports some bracket expressions, like /[[:upper:]]/ which means "any uppercase character" (check the Ruby doc for more)

Anchors:

  • \A matches the start of the string. Ex: 'renodor' will match /\Arenodor/ but not /\Aenodor/ (whereas it will match /enodor/)
  • \z matches the end of the string. Ex: 'renodor' will match /renodor\z/ but not /renodo\z/ (whereas it will match /renodo/)
  • ^ matches the start of a line, which is not necessarily the start of the string. It is like a lighter version of \A: if a string contains a line break, ^ also matches at the beginning of the second line (whereas \A won't, because it is not the beginning of the string)
  • $ matches the end of a line, which is not necessarily the end of the string. It is like a lighter version of \z
  • \b means a word boundary (beginning or end of a word). Ex: /\brenodor\b/ will match 'renodor' but not 'renodor123' (whereas /renodor/ will match both)

Modifiers: are placed right after the regex, ex: /renodor/i

  • i makes the regex case insensitive
  • m allows the dot . (which by default means any character except a newline) to also match newlines: "a\nb" will match /a.b/m (whereas it won't match /a.b/)
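
The difference between string anchors and line anchors, and the effect of the modifiers, is easiest to see on a two-line string. A minimal sketch (the sample text is made up; match? is used here simply to get true or false):

```ruby
text = "first line\nsecond line"

p(/\Asecond/.match?(text))      # false: "second" is not at the start of the string
p(/^second/.match?(text))       # true:  "second" is at the start of a line
p(/first line$/.match?(text))   # true:  end of a line
p(/first line\z/.match?(text))  # false: not the end of the string
p(/line\z/.match?(text))        # true

p(/\brenodor\b/.match?('renodor123')) # false: no word boundary after "renodor"

p(/RENODOR/i.match?('renodor')) # true:  i ignores case
p(/a.b/.match?("a\nb"))         # false: . does not match a newline
p(/a.b/m.match?("a\nb"))        # true:  with m it does
```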

Common Ruby Regex methods:

  • regex =~ string returns the index of the first match, or nil if there is no match
  • regex.match?(string) returns true if the string matches the regex, false if not

So we can use those two methods to build conditions:

if /b/ =~ 'abcd'
  p 'Yeah!'
else
  p 'oh no...'
end

if /b/.match?('abcd')
  p 'Yeah!'
else
  p 'oh no...'
end

(The order doesn't change anything: string.match?(regex) and string =~ regex work the same way)
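
What the two methods actually return, as a quick sketch. =~ gives an index (which is truthy, even when it is 0) or nil, which is why it works in a condition:

```ruby
first_index  = /b/ =~ 'abcd'   # 1 (index of the first match)
start_index  = /a/ =~ 'abcd'   # 0 (still truthy in Ruby)
no_match     = /z/ =~ 'abcd'   # nil
reversed     = 'abcd' =~ /c/   # 2 (string on the left works too)
boolean_form = 'abcd'.match?(/c/) # true

p first_index, start_index, no_match, reversed, boolean_form
```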

Match Data: string.match(regex) returns a MatchData object, or nil if there is no match. This object contains the different matches it found, so that you can work with them. It is especially useful when you use groups in your regex:

# I create a Regex looking for 'abc', and putting it in three different groups
regex = /(a)(b)(c)/

match_data_result = regex.match('abc')
# Now my variable match_data_result contains a MatchData object that looks like this:
# #<MatchData "abc" 1:"a" 2:"b" 3:"c">

# So I can get the different regex groups:
match_data_result[0] # will return "abc" (the whole match)
match_data_result[1] # will return "a" (the first group)
match_data_result[2] # will return "b" (the second group)
match_data_result[3] # will return "c" (the third group)

You can even name your regex groups and then call them by their names:

# I create a Regex looking for:
# - one or more of any word characters (and I name this group "first_name")
# - a ":"
# - one or more of any word characters (and I name this group "last_name")
regex = /(?<first_name>\w+):(?<last_name>\w+)/

# I create my MatchData object
match_data_result = regex.match('Bruce:Wayne')

# I can now call the different groups by their name on my MatchData:
match_data_result[:first_name] # will return "Bruce" (match_data_result['first_name'] works too)
match_data_result[:last_name] # will return "Wayne" (and match_data_result['last_name'] works too)

Scan: scan is similar to match but instead of returning only the first match it will scan the whole string and return all matches in an array. If you create regex groups, it will return nested arrays:

'abab'.match(/a|b/) # will return a MatchData with only one match: "a"
'abab'.scan(/a|b/) # will return an array with: ["a", "b", "a", "b"]
'abab'.scan(/(a|b)/) # will return an array with: [["a"], ["b"], ["a"], ["b"]]

Gsub: .gsub is called on a string and allows you to replace part of this string. It is often used with regex:

'renodor'.gsub('o', '*') # will return 'ren*d*r'
'renodor'.gsub(/o|e/, '*') # will return 'r*n*d*r'

# You can also capture groups in your regex and reuse them:
'renodor'.gsub(/(o|e)/, '~\1~') # will return 'r~e~n~o~d~o~r'
# Indeed, I created a regex that looks for 'e' or 'o' and puts it in a group
# In the second argument of gsub, \1 refers to my group,
# so '~\1~' means "when you find my group, put a ~ before and after it"

# You can also use named groups, and then call them with the \k<group_name> notation:
'renodor'.gsub(/(?<my_group>o|e)/, '-\k<my_group>-') # will return 'r-e-n-o-d-o-r'

parsing and storing data

Parse data = read data written in another format and convert it into objects of the language you want to use

To manipulate CSV data in Ruby we use the CSV class, which already includes a lot of built-in methods. You need to require 'csv' before you can use it

Parse CSV:

require 'csv'

file_path = 'path/to/file.csv'

CSV.foreach(file_path) do |row|
  p row[0]
end
# For each row of the file, it will output the value of the first column

CSV.foreach(file_path) do |row|
  p "#{row[0]} | #{row[1]} | #{row[2]}"
end
# For each row of the file, it will output the value of the first 3 columns separated by pipes

You can use CSV options to better handle the parsing. The CSV options are a Hash where you can define different parameters (check the CSV documentation to see all available options). Ex:

require 'csv'

file_path = 'path/to/file.csv'
csv_options = { col_sep: ',', quote_char: '"', headers: :first_row }
# We specify that it's a comma-separated CSV, that strings are double quoted, and that the first row is the header
# (so that the header won't be returned when using CSV.foreach)

# ** passes the hash as keyword arguments (required since Ruby 3.0)
CSV.foreach(file_path, **csv_options) do |row|
  p row['column_name'] # and now we can use the column name instead of its index
end
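
To try header-based access without creating a file, CSV.parse accepts the CSV content as a string. A minimal sketch; the beer data is made up for illustration:

```ruby
require 'csv'

# Sample CSV content, made up for illustration
csv_text = "Name,Type,Origin\nGuinness,Stout,Ireland\nChimay,Trappist,Belgium\n"

# With headers enabled, each row can be read by column name
rows = CSV.parse(csv_text, headers: :first_row)

rows.each do |row|
  puts "#{row['Name']} comes from #{row['Origin']}"
end

names = rows.map { |row| row['Name'] }
p names # ["Guinness", "Chimay"]
```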

Store CSV:

require 'csv'

file_path = 'path/to/file.csv'
csv_options = { col_sep: ',', force_quotes: true, quote_char: '"' }

CSV.open(file_path, 'wb', **csv_options) do |csv|
  csv << ['Name', 'Type', 'Origin']
  csv << ['Guinness', 'Stout', 'Ireland']
  # etc... line per line
end

# 'wb' means write (in binary mode). So the whole content of the CSV file will be overwritten.
# You can also use 'a' instead, which means 'append', so the new content is added at the end of the CSV file:

CSV.open(file_path, 'a', **csv_options) do |csv|
  # code
end

JSON: JavaScript Object Notation is a human-readable format used to store and transport data. It is syntactically very close to JavaScript objects, and very similar to Ruby Hashes.

To manipulate JSON data in Ruby we use the JSON module, which already includes a lot of built-in methods. You need to require 'json' before you can use it:

Parse JSON:

require 'json'

file_path = 'path/to/file.json'
serialized_file = File.read(file_path) # We use the File class to read the JSON file into a string
JSON.parse(serialized_file) # Returns a Hash (or an Array, depending on the JSON) containing the JSON file data

Parse JSON from the web with an API: a simplified explanation of what an API (Application Programming Interface) is: it is the interface a program provides so that you can interact with its data. Nowadays, the standard format to exchange data is JSON, so most APIs use JSON to send or receive data. (Ex: with the SWAPI API you can get data about Star Wars movies)

require 'json'
require 'open-uri' # allow to open URLs in ruby

url = 'https://my-api-endpoint'
serialized_data = URI.open(url).read # open-uri provides URI.open (plain open no longer accepts URLs since Ruby 3.0)
JSON.parse(serialized_data) # Returns a Hash containing the JSON data from the url

Store JSON:

require 'json'

# We create a Hash with data we want to store:
data = { name: 'Batman', identity: 'Bruce Wayne', power: 'Rich' }

file_path = 'path/to/file.json'
File.open(file_path, 'wb') do |file|
  file.write(JSON.generate(data))
end
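
The same generate/parse round trip can be tried in memory, without any file. Note that JSON.parse returns string keys by default; symbolize_names: true is a standard option that gives symbol keys instead:

```ruby
require 'json'

data = { name: 'Batman', identity: 'Bruce Wayne', power: 'Rich' }

json_string = JSON.generate(data)
puts json_string # {"name":"Batman","identity":"Bruce Wayne","power":"Rich"}

parsed = JSON.parse(json_string)
p parsed['name'] # "Batman" (keys come back as strings)
p parsed[:name]  # nil

symbolized = JSON.parse(json_string, symbolize_names: true)
p symbolized[:name] # "Batman"
```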

Scraping: when there is no API, you need to get what you are looking for directly from the HTML... We say that you scrape the HTML

To work with HTML (and XML) in Ruby we use the nokogiri gem

require 'open-uri'
require 'nokogiri'

url = "https://www.url-to-scrap"

html_file = URI.open(url).read # open-uri provides URI.open (plain open no longer accepts URLs since Ruby 3.0)
html_doc = Nokogiri::HTML(html_file) # will create a "nokogiri" document you can manipulate (and call .search on it)

css_selector = '.class-name' # this is the CSS selector you will use on the HTML document to find the specific elements you are looking for
html_doc.search(css_selector).each do |element|
  puts element
end

object oriented programming

Object-oriented programming (OOP) is a programming paradigm based on the concept of objects, which can contain data (attributes, properties) and behavior (methods). Ruby is an OOP language. In Ruby everything is an object: (almost) all data types are actually classes:

  • When you declare a string: "Hello" you are actually creating an instance of the class String. It is the same as doing String.new("Hello") >> it is the data (or state)
  • Then if you do "Hello".upcase it will return "HELLO" >> you called a behavior (a method) of your instance

Classes allow you to create objects. An object created thanks to a class is called an instance of this class:

  • A class is like a cake mold: it helps to create a cake, it defines the shape of your objects (their data and behavior)
  • An instance of a class is like a cake that you created thanks to your class (the cake mold)

classes

In Ruby you create a class with the keyword class. Class names are in UpperCamelCase (ex: SportCar)

class Car
  # some code
end

By convention, there is only one class per file, and file names are in lower_snake_case

When you have a class you can create instances using the new method (this is called "instantiating" the class):

nice_car = Car.new
cool_car = Car.new

When you instantiate a class, it will automatically call its initialize method:

class Car
  def initialize
  end
end

In this initialize method you can start to define data. Ex:

class Car
  def initialize(color)
    @engine_started = false
    @color = color
  end
end

green_car = Car.new('green')

By doing so we created an instance of the class Car that is green, and with its engine not started

@color and @engine_started are instance variables:

  • Instance variables start with an @
  • They are available in every instance method of the class
  • They store the state (data) of one specific instance (@color can be "green" for one instance and "orange" for another instance)
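
To see that each instance keeps its own state, here is a small sketch. It uses instance_variables and instance_variable_get, two introspection methods every Ruby object has; they are handy for debugging, not something you would normally use in application code:

```ruby
class Car
  def initialize(color)
    @engine_started = false
    @color = color
  end
end

green_car  = Car.new('green')
orange_car = Car.new('orange')

# Each instance holds its own copy of the instance variables
green_car.instance_variables                       # lists :@engine_started and :@color
green_car.instance_variable_get(:@color)           # => "green"
orange_car.instance_variable_get(:@color)          # => "orange"
orange_car.instance_variable_get(:@engine_started) # => false
```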

We can then add behavior to our class by creating instance methods:

class Car
  def initialize(color)
    @engine_started = false
    @color = color
  end

  def engine_started?
    @engine_started
  end

  def starts_engine
    @engine_started = true
  end
end

green_car = Car.new('green')
green_car.engine_started? # returns false
green_car.starts_engine # now @engine_started = true

We now have two instance methods: one that tells us whether the engine is started, and another one that starts it.

  • Instance methods allow you to access or define the behavior of a specific instance
  • Instance methods can be called only on instances, not on classes (you can't do Car.starts_engine: which car's engine would it start?)
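
A short sketch of both points: starting one car does not start the other, and calling the instance method on the class itself raises a NoMethodError (rescued here only so the snippet runs to the end):

```ruby
class Car
  def initialize
    @engine_started = false
  end

  def engine_started?
    @engine_started
  end

  def starts_engine
    @engine_started = true
  end
end

first_car  = Car.new
second_car = Car.new
first_car.starts_engine

first_car.engine_started?  # => true
second_car.engine_started? # => false (its state was not touched)

begin
  Car.starts_engine # the class itself has no such method
rescue NoMethodError => e
  error_message = e.message # the message mentions starts_engine
end
```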

If you want to access the state of your instances (the color), you need to make it accessible. You can create a "getter" method (like in JS):

class Car
  def initialize(color)
    @color = color
  end

  def color
    @color
  end
end

green_car = Car.new('green')
green_car.color # returns 'green'

But Ruby provides a shortcut for that: attr_reader

class Car
  attr_reader :color, :brand

  def initialize(color, brand)
    @color = color
    @brand = brand
  end
end

green_car = Car.new('green', 'cool brand')
green_car.color # returns 'green'
green_car.brand # returns 'cool brand'

In the same way, if you want to change the state of your instance, you can create a "setter" method (like in JS):

class Car
  attr_reader :color, :brand

  def initialize(color, brand)
    @color = color
    @brand = brand
  end

  def paint=(new_color)
    @color = new_color
  end
end

green_car = Car.new('green', 'cool brand')
green_car.paint = 'orange' # we change the color of the car
green_car.color # will now return 'orange'

But Ruby provides a shortcut for that: attr_writer

class Car
  attr_reader :color, :brand
  attr_writer :color

  def initialize(color, brand)
    @color = color
    @brand = brand
  end
end

green_car = Car.new('green', 'cool brand')
green_car.color = 'orange' # we change the color of the car
green_car.color # will now return 'orange'

Now if an attribute needs to be read AND written, there is another shortcut: attr_accessor

class Car
  attr_reader :brand
  attr_accessor :color

  def initialize(color, brand)
    @color = color
    @brand = brand
  end
end
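
A usage sketch for the class above: color can be read and written, while brand stays read-only because it only has an attr_reader.

```ruby
class Car
  attr_reader :brand
  attr_accessor :color

  def initialize(color, brand)
    @color = color
    @brand = brand
  end
end

car = Car.new('green', 'cool brand')
car.color = 'orange' # setter generated by attr_accessor
car.color            # => "orange"
car.brand            # => "cool brand"

car.respond_to?(:color=) # => true
car.respond_to?(:brand=) # => false (no setter was generated for brand)
```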

By default in Ruby all methods are public (they can be called by anyone), but you can make a method private by defining it below the private keyword. Private methods can't be called from outside the class, only from within the class itself

class Car
  def initialize
  end

  private

  def secret_fuel
    'fusion'
  end
end

car = Car.new
car.secret_fuel # raises a NoMethodError: private method `secret_fuel' called
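
A private method is still usable from inside the class. In this sketch, fuel_report is a hypothetical public method added only to show that:

```ruby
class Car
  # Public method (hypothetical, for illustration) that relies on a private one
  def fuel_report
    "This car runs on #{secret_fuel}"
  end

  private

  def secret_fuel
    'fusion'
  end
end

car = Car.new
car.fuel_report               # => "This car runs on fusion"
car.respond_to?(:secret_fuel) # => false (respond_to? ignores private methods by default)
```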

Tip: you can require files in IRB (ex: require_relative 'car.rb') to test your classes. If the file has changed, you can reload it with load('car.rb')

Inheritance: a class can inherit from a parent class and thus get all its instance variables and instance methods

# Parent class
class Building
  attr_reader :name, :width, :length

  def initialize(name, width, length)
    @name = name
    @width, @length = width, length
  end

  def floor_area
    @width * @length
  end
end

# Child class that inherits from parent class
class Castle < Building
end

# The child class has access to all parent class instance methods and instance variables
castle = Castle.new("Versailles", 32, 35)
castle.name # => "Versailles"
castle.floor_area # => 1120

This does not prevent you from creating specific instance methods and instance variables on the child class. You can also redefine an instance method of the parent class in the child class: a method with the same name on the child class takes precedence over the one on the parent class.
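
A sketch of both cases. The methods description and has_a_butler? are hypothetical names added for illustration, they are not part of the example above:

```ruby
class Building
  attr_reader :name, :width, :length

  def initialize(name, width, length)
    @name = name
    @width = width
    @length = length
  end

  def floor_area
    @width * @length
  end

  def description
    "#{@name} is a building"
  end
end

class Castle < Building
  # Redefines Building#description: this version wins for castles
  def description
    "#{@name} is a castle"
  end

  # Exists only on Castle
  def has_a_butler?
    true
  end
end

castle   = Castle.new("Versailles", 32, 35)
building = Building.new("Warehouse", 10, 20)

castle.description                   # => "Versailles is a castle"
building.description                 # => "Warehouse is a building"
castle.floor_area                    # => 1120 (inherited, not redefined)
building.respond_to?(:has_a_butler?) # => false
```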

The super keyword: placed in a child instance method, super is a keyword that calls the parent's method with the same name

class Castle < Building
  # A castle always has a garden of 300 sq. m
  def floor_area
    super + 300  # `super` calls `floor_area` of `Building`
  end
end

Tip: if you call superclass on a class it returns the class it inherits from: Castle.superclass returns Building
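
You can walk up the whole chain the same way. A minimal sketch with empty classes, using only core methods (superclass, ancestors, is_a?):

```ruby
class Building; end
class Castle < Building; end

Castle.superclass          # => Building
Building.superclass        # => Object (the default parent of every class)
Castle.ancestors.first(3)  # => [Castle, Building, Object]
Castle.new.is_a?(Building) # => true (a castle is also a building)
Castle < Building          # => true (Castle is a subclass of Building)
```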

Class methods: methods that you call directly on the class itself. Ex: Time.now >> Time is a class (not an instance), and now is a class method. To define a class method, prefix the method name with self.:

class Castle < Building
  def self.categories
    ["Medieval", "Norman", "Ancient"]
  end
end

# You can now call categories directly on Castle class
Castle.categories.join(', ') # => "Medieval, Norman, Ancient"

self refers to the current object ("itself"). For example, it can also be used inside an instance method to refer to the instance itself

class Skyscraper < Building
  def initialize(name, width, length, height)
    super(name, width, length)
    @height = height
  end

  def type_of_owner
    if @height > 50
      "this #{self.capitalized_name} is a skyscraper for Spider-Man."
    else
      "this #{self.capitalized_name} is a skyscraper for beginners"
    end
  end

  def capitalized_name
    @name.capitalize
  end
end

skyscraper = Skyscraper.new("empire State Building", 30, 60, 381)
skyscraper.type_of_owner # => "this Empire state building is a skyscraper for Spider-Man." (capitalize also lowercases the rest of the string)

(self is actually not mandatory in this case: if you just call capitalized_name it works the same, as self is implicit... But you get the idea)
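
A sketch of what self is in each context, plus one case where self cannot be dropped: calling a setter from inside an instance method. The method names current_self, repaint and repaint_without_self are hypothetical, chosen only for this illustration:

```ruby
class Building
  def self.current_self
    self # inside a class method, self is the class
  end

  def current_self
    self # inside an instance method, self is the instance
  end
end

class Car
  attr_accessor :color

  def repaint_without_self(new_color)
    color = new_color # creates a local variable, @color is untouched
    color
  end

  def repaint(new_color)
    self.color = new_color # calls the color= setter
  end
end

building = Building.new
Building.current_self == Building      # => true
building.current_self.equal?(building) # => true

car = Car.new
car.repaint_without_self('red')
color_before = car.color # => nil
car.repaint('red')
color_after = car.color  # => "red"
```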

Tip: to make all the attributes of a class optional and unordered, you can have initialize take a single hash that defaults to an empty hash:

class Robot
  attr_reader :name, :type

  def initialize(attributes = {})
    @name = attributes[:name]
    @type = attributes[:type]
    @artificial_intelligence_level = attributes[:ai]
  end
end

# All the following examples are valid
Robot.new(name: 'R2D2')
Robot.new(name: 'R2D2', type: 'Astromech Droid')
Robot.new(type: 'Astromech Droid', name: 'R2D2')
Robot.new(ai: 173, type: 'Astromech Droid')
Robot.new(name: 'R2D2', type: 'Astromech Droid', ai: 173)
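
An alternative worth knowing, shown here as a sketch and not as the approach used in these notes: keyword arguments with default values. The calls look the same, but a misspelled key raises an ArgumentError instead of being silently ignored as it is with the hash version. The nil defaults are an assumption made to mimic the hash behavior:

```ruby
class Robot
  attr_reader :name, :type, :artificial_intelligence_level

  # Every keyword is optional because it has a default value
  def initialize(name: nil, type: nil, ai: nil)
    @name = name
    @type = type
    @artificial_intelligence_level = ai
  end
end

robot = Robot.new(type: 'Astromech Droid', name: 'R2D2') # order does not matter
robot.name                          # => "R2D2"
robot.artificial_intelligence_level # => nil

begin
  Robot.new(nmae: 'R2D2') # typo in the key
rescue ArgumentError => e
  typo_error = e.message # the message names the unknown keyword
end
```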