Sunday, June 26, 2016

Ruby Classes

Introduction to classes in Ruby

Classes are at the heart of object-oriented programming (OOP). A class is a way of grouping related variables and functions into a single container. This container is called an "object". Put another way, a class is like a blueprint from which objects are created.

Let's use a car blueprint as an example and define a simple class:


class Car
attr_accessor :color
end

In the above example:
  • class is a keyword used to create classes;
  • Car is the name of the class;
  • attr_accessor is an attribute accessor, which is a method that allows us to read and write instance variables from outside the class (this will be explained later);
  • :color is an attribute, which is a component of a class (such as a variable) which can be accessed from the outside;
  • end ends the class declaration

Instances (objects)


Generally, in order to use a class, we need to create an instance. Continuing the car example, a class is a blueprint and an instance is an actual car. In other words, an instance is an object built from the blueprint provided by a class. The words instance and object are used interchangeably.

The syntax for instantiating a class is:


instance_name = ClassName.new 

Let's try it with the class created above:


# Create an instance of the Car class 
civic = Car.new     
 
# Set a value for the color attribute 
civic.color = "silver" 
 
# Read the value of the color attribute 
civic.color        # Output: => "silver" 

We could create any number of instances of the Car class and store them in a collection such as an array or a hash. Each instance could have a different value for the "color" attribute.

We named the object civic but we could have used any other name. Convention dictates that class names should be written in CamelCase and object names in snake_case.

Reflection: examining objects


Reflection, also known as introspection, is the ability to programmatically examine (inspect) classes, objects, methods, etc. In other words, the application provides information about its own code and state. That is useful in the context of metaprogramming, debugging and understanding code written by other people.

Inspecting objects

The inspect method returns a human-readable string representation of any object, including the class name, a hexadecimal representation of the object ID, and the names and values of the object's variables (instance variables).


class Meditation 
  def initialize 
    @name = "zazen" 
    @minutes = 40 
  end  
end 
 
m = Meditation.new 
 
m.inspect  # Output: => "#<Meditation:0x00000001c87e08 @name=\"zazen\", @minutes=40>" 

When writing a class, we can override the inspect method to provide more useful information about its objects.
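
As a minimal sketch of that idea, the Meditation class from the example above could define its own inspect method. The format of the returned string is just an assumption made for illustration; any string will do.

```ruby
class Meditation
  def initialize
    @name = "zazen"
    @minutes = 40
  end

  # Custom inspect: returns a shorter description instead of the default one
  def inspect
    "#<Meditation #{@name} (#{@minutes} min)>"
  end
end

m = Meditation.new
m.inspect  # Output: => "#<Meditation zazen (40 min)>"
```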

Testing if an object is an instance of a specific class

There are several ways to check whether an object is an instance of a particular class.


class Meditation 
end 
 
zazen = Meditation.new 
 
Meditation === zazen  # Output: => true 
zazen.is_a? Meditation  # Output: => true 
zazen.instance_of? Meditation # Output: => true 

Both === and is_a? return true if the object is an instance of the given class or of any of its descendants (subclasses). The instance_of? method is stricter and only returns true if the object is a direct instance of that specific class, not of one of its descendants. The term "descendant" is related to class inheritance and will be explained soon.

There is also the kind_of? method, which is just a different name for is_a?.
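
The difference only shows up when a subclass is involved. The sketch below uses two hypothetical classes, Cushion and Zafu, which are not part of the examples in this post; the < syntax makes Zafu a subclass of Cushion and is covered in the section about inheritance.

```ruby
# Cushion and Zafu are hypothetical classes used only for this illustration
class Cushion
end

class Zafu < Cushion  # Zafu is a subclass of Cushion
end

z = Zafu.new

z.is_a? Cushion         # Output: => true
Cushion === z           # Output: => true
z.instance_of? Cushion  # Output: => false
z.instance_of? Zafu     # Output: => true
```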


Kernel.instance_method(:kind_of?) == Kernel.instance_method(:is_a?) # Output: => true 

In the following example, we create one more instance, then get a list of all instances of the Meditation class.


kinhin = Meditation.new 

ObjectSpace.each_object(Meditation) { |x| puts x } 
Output:

#<Meditation:0x00000001dd85a0> 
#<Meditation:0x00000001de1ba0> 

The above output is a string representation of the two instances of the Meditation class (zazen and kinhin). We could use the above code, for instance, to read or write an attribute in all instances of a class.
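
For instance, writing an attribute in every live instance might look like the sketch below. It reuses the Car class from the beginning of this post; the accord variable is a hypothetical second instance, and the example assumes the standard Ruby interpreter (MRI), where ObjectSpace.each_object is available.

```ruby
class Car
  attr_accessor :color
end

civic = Car.new
accord = Car.new  # hypothetical second instance

# Set the color attribute in all existing instances of Car
ObjectSpace.each_object(Car) { |car| car.color = "black" }

civic.color   # Output: => "black"
accord.color  # Output: => "black"
```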

Classes are also objects


Almost everything is an object in Ruby, even classes. All classes are instances of a built-in class called Class.


class Foo 
end 
 
# The newly created Foo class is an instance of the built-in Class class. 
Foo.instance_of? Class    # Output: => true 
 
# Instances of the Foo class are, well, just instances of the Foo class…  
f = Foo.new 
f.instance_of? Foo    # Output: => true 
f.instance_of? Class    # Output: => false 

That may be easier to grasp when represented visually.


Class (built-in class) 
   | 
   |--- Foo (instance of Class) 
         | 
         |--- f (instance of Foo) 

When starting to learn OOP, it's easy to confuse instantiation with inheritance (which will be discussed later). However, they are completely different things. In the above example, the Foo class is an instance of the built-in class named Class. Foo is not a subclass of Class; there is no inheritance relationship between them.

Ruby's built-in classes are also instances of the Class class.


String.instance_of? Class    # Output: => true 
Integer.instance_of? Class    # Output: => true 

The initialize method


Every time an object is created (a class is instantiated), Ruby looks for a special method called initialize within the class. If it's there, it's automatically executed. Defining the initialize method is optional; if it's not defined, nothing happens.

The initialize method is often used to set default values to instance variables. In the example below, all new instances of the Car class are created with "black" as the default value for the color instance variable.


class Car 
  attr_accessor :color 
 
  def initialize 
    @color = "black" 
  end 
end 
 
c = Car.new 
c.color    # Output: => "black" 

We can also add parameters when defining the initialize method. When instantiating a class, any arguments passed to the new method are received by the initialize method.


class Car 
  attr_accessor :color 
 
  def initialize(color) 
    @color = color 
  end 
end 
 
c = Car.new  # Output: ArgumentError: wrong number of arguments (given 0, expected 1) 
 
# The "silver" argument passed here will be received by the initialize method 
c = Car.new "silver"   
c.color  # Output: => "silver" 

initialize handles arguments like any other method. It can have positional parameters (required and optional), a single splat parameter, keyword parameters (required and optional) and a double splat parameter. It may also receive a block implicitly or have an explicit block parameter (prefixed with &). This post about methods explains how each type of parameter/argument works.
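
As a rough sketch, an initialize method combining several of these parameter types could look like this. The parameter names (wheels, extras, year, options) are made up for this illustration.

```ruby
class Car
  attr_reader :color, :wheels, :extras, :year, :options

  # color   - required positional parameter
  # wheels  - optional positional parameter (defaults to 4)
  # extras  - splat parameter, collects any remaining positional arguments
  # year    - required keyword parameter
  # options - double splat parameter, collects any remaining keyword arguments
  def initialize(color, wheels = 4, *extras, year:, **options)
    @color = color
    @wheels = wheels
    @extras = extras
    @year = year
    @options = options
  end
end

c = Car.new("silver", 4, "sunroof", year: 2016, turbo: true)
c.color    # Output: => "silver"
c.extras   # Output: => ["sunroof"]
c.year     # Output: => 2016
c.options == { turbo: true }  # Output: => true

d = Car.new("black", year: 2015)
d.wheels   # Output: => 4
d.extras   # Output: => []
```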

Note that Ruby implicitly makes initialize a private method and silently discards its return value.
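
Both behaviors can be observed with a small experiment. This is only a sketch; the string returned by initialize is arbitrary.

```ruby
class Car
  def initialize
    "this return value is ignored"
  end
end

c = Car.new
c.class  # Output: => Car (new returns the object, not the string)

# initialize was made private, although we never asked for it
Car.private_instance_methods(false).include?(:initialize)  # Output: => true
c.respond_to?(:initialize)                                 # Output: => false
```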

Attributes and accessor methods


Attributes are class components that can be accessed from outside the object. They are known as properties in many other programming languages. Their values are accessible by using the "dot notation", as in object_name.attribute_name. Unlike Python and a few other languages, Ruby does not allow instance variables to be accessed directly from outside the object.


class Car 
  def initialize 
    @wheels = 4  # This is an instance variable 
  end 
end 
 
c = Car.new 
c.wheels     # Output: NoMethodError: undefined method `wheels' for #<Car:0x00000000d43500> 

In the above example, c is an instance (object) of the Car class. We tried unsuccessfully to read the value of the wheels instance variable from outside the object. What happened is that Ruby attempted to call a method named wheels within the c object, but no such method was defined. In short, object_name.attribute_name tries to call a method named attribute_name within the object. To access the value of the wheels variable from the outside, we need to implement an instance method by that name, which will return the value of that variable when called. That's called an accessor method. In the general programming context, the usual way to access an instance variable from outside the object is to implement accessor methods, also known as getter and setter methods. A getter allows the value of a variable defined within a class to be read from the outside and a setter allows it to be written from the outside.

In the following example, we have added getter and setter methods to the Car class to access the wheels variable from outside the object. This is not the "Ruby way" of defining getters and setters; it serves only to illustrate what getter and setter methods do.


class Car 
  def wheels  # getter method 
    @wheels 
  end 
 
  def wheels=(val)  # setter method 
    @wheels = val 
  end 
end 
 
f = Car.new 
f.wheels = 4  # The setter method was invoked 
f.wheels  # The getter method was invoked 
# Output: => 4 

The above example works and similar code is commonly used to create getter and setter methods in other languages. However, Ruby provides a simpler way to do this: three built-in methods called attr_reader, attr_writer and attr_accessor. The attr_reader method makes an instance variable readable from the outside, attr_writer makes it writable, and attr_accessor makes it readable and writable.

The above example can be rewritten like this.


class Car 
  attr_accessor :wheels 
end 
 
f = Car.new 
f.wheels = 4 
f.wheels  # Output: => 4 

In the above example, the wheels attribute will be readable and writable from outside the object. If instead of attr_accessor, we used attr_reader, it would be read-only. If we used attr_writer, it would be write-only. Those three methods are not getters and setters in themselves but, when called, they create getter and setter methods for us. They are methods that dynamically (programmatically) generate other methods; that's called metaprogramming.
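
To illustrate the difference, here is a short sketch in which wheels is read-only and color is write-only. We use respond_to? to check which accessor methods were actually generated.

```ruby
class Car
  attr_reader :wheels  # generates only the getter (wheels)
  attr_writer :color   # generates only the setter (color=)

  def initialize
    @wheels = 4
  end
end

c = Car.new
c.wheels            # Output: => 4
c.color = "silver"  # The generated setter was invoked

c.respond_to?(:wheels=)  # Output: => false (no setter for wheels)
c.respond_to?(:color)    # Output: => false (no getter for color)
```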

The first (longer) example, which does not employ Ruby's built-in methods, should only be used when additional code is required in the getter and setter methods. For instance, a setter method may need to validate data or do some calculation before assigning a value to an instance variable.
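
A hand-written setter with validation might look like the following sketch. The validation rule and the error message are assumptions made for the sake of the example.

```ruby
class Car
  attr_reader :wheels  # the getter needs no extra logic

  # Hand-written setter: validates the value before assigning it
  def wheels=(val)
    unless val.is_a?(Integer) && val > 0
      raise ArgumentError, "wheels must be a positive integer"
    end
    @wheels = val
  end
end

f = Car.new
f.wheels = 4
f.wheels  # Output: => 4

begin
  f.wheels = -1  # Rejected by the setter
rescue ArgumentError => e
  e.message  # Output: => "wheels must be a positive integer"
end

f.wheels  # Output: => 4 (the invalid value was not assigned)
```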

It is possible to access (read and write) instance variables from outside the object, by using the instance_variable_get and instance_variable_set built-in methods. However, this is rarely justifiable and usually a bad idea, as bypassing encapsulation tends to wreak all sorts of havoc.
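
For completeness, this is roughly what that looks like. Note that the variable name is passed as a symbol, including the @ prefix.

```ruby
class Car
  def initialize
    @wheels = 4
  end
end

c = Car.new
c.instance_variable_get(:@wheels)     # Output: => 4
c.instance_variable_set(:@wheels, 3)  # Output: => 3
c.instance_variable_get(:@wheels)     # Output: => 3
```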

Inheritance


Through inheritance, a class acquires (inherits) components from another class.

A class that inherits from another class is called a subclass, also known as a child class or derived class. The class that is inherited from (where the inherited components are implemented) is called a superclass or parent class. We will use these terms interchangeably throughout this post. You will also see the terms ancestor and descendant. Ancestors are all classes above a specific class in its inheritance hierarchy; descendants are all classes below it.

Usually, the superclass (parent) is more general and its subclasses (children) add further specialization. For instance, a class called Car may specify that cars have 4 wheels, a steering wheel and so on. This class may inherit from a class called Vehicle that implements the details of combustion engines and will also be inherited by the Motorcycle class. Another example is a Polygon class which contains common characteristics of all polygons and is inherited by other classes named Square and Triangle.

Some programming languages such as C++, Perl, and Python allow one class to inherit from multiple other classes; that is called multiple inheritance. Ruby does not support multiple inheritance. That means each class can only inherit from one other class. However, many classes can inherit from the same class.

Again, beware not to confuse inheritance with instantiating, as they are completely different things.

Method overriding


Method overriding allows a subclass to provide its own implementation of an inherited method. When there are two methods with the same name, one in the superclass and another in the subclass, the implementation of the subclass will override the one from the superclass. That happens only within the subclass; the original method implementation within the superclass is not affected.


class A 
  def meditate 
    puts "Practicing zazen…" 
  end 
end 
 
class B < A 
  def meditate 
    puts "Practicing kinhin…" 
  end 
end 
 
b = B.new 
b.meditate     # Output: Practicing kinhin… 

The super keyword


As seen above, if both superclass and subclass have methods of the same name, the implementation of the subclass will prevail (inside the subclass). However, instead of overriding the implementation of the superclass, we might need to add extra functionality. Using the super keyword within the subclass allows us to do that; super calls the superclass implementation of the corresponding method. In other words, it allows the overriding method to call the overridden method.


class Zazen 
  def meditate 
    puts "Practicing Zazen…" 
  end 
end 
 
class Sitting < Zazen 
  def meditate 
    puts "Sitting…" 
    super # Calls the meditate method implemented in the parent class 
    puts "Getting up…" 
  end 
end 
 
s = Sitting.new 
s.meditate 
Output:

Sitting… 
Practicing Zazen… 
Getting up… 

Notice how, in the example above, the statements from both meditate methods (implemented in both classes) were executed.

How super handles arguments

Regarding argument handling, the super keyword can behave in three ways:

When called with no arguments, super automatically passes any arguments received by the method from which it's called (at the subclass) to the corresponding method in the superclass.


class A 
  def some_method(*args) 
    puts "Received arguments: #{args}" 
  end 
end 
 
class B < A 
  def some_method(*args) 
    super 
  end 
end 
 
b = B.new 
b.some_method("foo", "bar")     # Output: Received arguments: ["foo", "bar"] 

If called with empty parentheses (empty argument list), no arguments are passed to the corresponding method in the superclass, regardless of whether the method from which super was called (on the subclass) has received any arguments.


class A 
  def some_method(*args) 
    puts "Received arguments: #{args}" 
  end 
end 
 
class B < A 
  def some_method(*args) 
    super()  # Notice the empty parentheses here 
  end 
end 
 
b = B.new 
b.some_method("foo", "bar")     # Output: Received arguments: []

When called with an explicit argument list, it sends those arguments to the corresponding method in the superclass, regardless of whether the method from which super was called (on the subclass) has received any arguments.


class A 
  def some_method(*args) 
    puts "Received arguments: #{args}" 
  end 
end 
 
class B < A 
  def some_method(*args) 
    super("baz", "qux")  # Notice that specific arguments were passed here 
  end 
end 
 
b = B.new 
b.some_method("foo", "bar")     # Output: Received arguments: ["baz", "qux"] 

Reflection: examining the inheritance hierarchy of a class


Ruby provides reflective methods which return information about a class's inheritance chain.


class AsianReligion 
end 
 
class Buddhism < AsianReligion 
end 
 
class Zen < Buddhism 
end 

Let's suppose we need to identify the relationship between the above classes (regarding inheritance), while being unable to look at the above code.

Check if the Zen class is a descendant of the Buddhism and AsianReligion classes:


Zen < Buddhism  # Output: => true 
Zen < AsianReligion  # Output: => true 

Identify the superclass of the Zen class:


Zen.superclass  # Output: => Buddhism 

Get a list of all ancestors of the Zen class. The ancestors method returns the whole inheritance hierarchy (Zen and all classes above it) and all the modules included in these classes. In the following example, we exclude any modules, leaving only ancestor classes.


Zen.ancestors - Zen.included_modules  # Output: => [Zen, Buddhism, AsianReligion, Object, BasicObject]

Notice how the three classes defined above are included, along with a couple of Ruby's built-in classes called Object and BasicObject. All classes inherit implicitly from these two built-in classes. However, explaining the Ruby Core Object Model is outside the scope of this post.

Rails provides the descendants and subclasses methods to list a class's descendants and direct subclasses. At the time of writing, Ruby does not provide a built-in method to do that (Ruby 3.1 and later include Class#subclasses, which lists direct subclasses only). We can, however, use the following code; it returns all descendants of the AsianReligion class (and the class itself).


ObjectSpace.each_object(AsianReligion.singleton_class).to_a 
# Output: => [Buddhism, AsianReligion, Zen] 

Polymorphism


Briefly put, polymorphism means calling the same method on different objects and getting different results. We are actually calling different implementations of the method (entirely different methods with the same name). Hence, the different results.

There are three types of polymorphism: inheritance polymorphism, interface polymorphism, and abstract polymorphism. This post covers the first two but not the third, as it is usually implemented by using abstract classes, which are not supported by Ruby.

Inheritance Polymorphism


In Ruby, polymorphism is usually implemented through inheritance, as in the example below. Remember that if both child and parent classes define methods with the same name, the implementation of the subclass prevails, and the one from the superclass is overridden.


class Mammal 
  @@vertebrate = true 
  @@endothermic = true 
  @@fur = true 
 
  def make_sound 
    raise NotImplementedError, "The make_sound method should be implemented in the subclass." 
  end 
end 
 
class Cat < Mammal
end 
 
class Dog < Mammal
  def make_sound 
    puts "Woof" 
  end 
end 
 
c = Cat.new  
c.make_sound  # Output: NotImplementedError: The make_sound method should be implemented in the subclass. 
 
d = Dog.new 
d.make_sound  # Output: Woof 

In the example above, notice how both Cat and Dog classes are subclasses of Mammal. To understand polymorphism, just look at the make_sound method implemented in the Mammal class; it does nothing except make sure that all subclasses of Mammal implement their own make_sound method.

Interface Polymorphism


Different methods with the same name are implemented in distinct classes and do different things. An example is the + method. Remember that, in Ruby, lots of operators such as + are implemented as methods. That means 2 + 3 is syntactic sugar (a convenient shortcut) for 2.+(3).

Different classes provide distinct implementations of the + method. When called on a string object, it will concatenate two operands:


"foo" + "bar"  # Output: => "foobar" 

When called on a float, it will sum two operands:


1.0 + 1.0  # Output: => 2.0 

When called on an array, it will merge two operands into a single new array.


[ "foo", "bar" ] + [ "baz" ]  # Output: => ["foo", "bar", "baz"]

Each one of the three classes (String, Float, and Array) provides its own implementation of the + method. That means the appropriate method (implementation) is always called, depending on the class of the object that receives the call. That is an example of interface polymorphism.

Duck Typing


Duck typed objects are defined by what they do, instead of their type. In other words, instead of requiring an object to be an instance of a particular class, we require it to respond to one or more specific methods. The term duck typing comes from the saying "if the object walks like a duck and quacks like a duck, then it must be a duck".

The + method can also be used as an example of duck typing. In the following example, we created a method called sum, which takes two arguments; regardless of the types of the objects passed as arguments, it expects them to respond to the + method.


def sum(a, b) 
  a + b 
end 
 
# Integers, strings and arrays respond to the + method as expected 
sum(1,1)  # Output:  => 2 
sum("foo", "bar")  # Output: => "foobar" 
sum([1,2,3], [4,5])  # Output: => [1, 2, 3, 4, 5] 
 
# Hashes and ranges do not respond to the + method 
sum({a:1}, {b:2})  # Output: NoMethodError: undefined method `+' for {:a=>1}:Hash 
sum(0..1, 2..3)  # Output: NoMethodError: undefined method `+' for 0..1:Range 

Usually, we only check whether the object implements a method by a specific name. However, the method may exist but return something unexpected. That can be avoided by thoroughly testing our code, and most good developers will do just that.
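
Such a check is typically done with respond_to?. The sketch below is a variation of the sum method above that verifies the first argument before using it; raising ArgumentError with this particular message is just one possible choice.

```ruby
def sum(a, b)
  # Duck typing check: we only care whether a responds to +, not about its class
  unless a.respond_to?(:+)
    raise ArgumentError, "#{a.inspect} does not respond to +"
  end
  a + b
end

sum(1, 1)          # Output: => 2
sum("foo", "bar")  # Output: => "foobar"

begin
  sum({a: 1}, {b: 2})
rescue ArgumentError => e
  e.class  # Output: => ArgumentError
end
```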

Variable types


Ruby provides five types of variables: global, instance, class, local and constant. This post covers the last four plus class instance variables, which are a particular type of instance variable. Global variables are not covered because, in the vast majority of cases, using them is a bad practice.

Instance variables


Instance variables are defined within instance methods, and their names begin with @. Their value is only accessible within the specific object on which it was set. In other words, when we modify the value of an instance variable, the change only applies to that particular instance. Unlike local variables which are only available within the method where they were defined, instance variables are accessible by all methods within the object (instance methods of the class). Instance variables are the most commonly used type of variable in Ruby classes.


class Car 
  attr_reader :color 
 
  def set_color(color_received_as_argument)
    @color = color_received_as_argument
  end 
end 
 
car1 = Car.new  
car1.color     # Output: => nil   
car1.set_color "black" 
car1.color     # Output: => "black" 
 
car2 = Car.new 
car2.set_color "silver" 
car2.color    # Output: => "silver" 

In the example above, notice that:
  • Trying to access an instance variable before it's initialized will not raise an exception. Its default value is nil.
  • Changing the value of the color variable in one instance of the Car class does not affect the value of the same variable in the other instances.

To get a list of all instance variables of an object, use the instance_variables method, which returns an array of instance variable names.


class Car 
  def initialize 
    @wheels = 4 
  end 
end 
 
c = Car.new 
c.instance_variables  # Output: => [:@wheels]  

Note that only initialized instance variables (those that were already given a value) are shown by the instance_variables method.

As seen earlier in this post, instance variables need accessor methods to be read and written from outside the object.

Class variables


Class variables are defined at the class level, outside any methods. Their names begin with @@, and their values can be read or written from within the class itself or any of its subclasses and instances. Class variables can be accessed by both class methods and instance methods (explained further below).


class Car 
  @@count = 0        # This is a class variable 
 
  def initialize 
    @@count += 1    # Increment the count each time the class is instantiated 
    puts @@count 
  end 
 
  # This is a getter method, used to read the @@count class variable from outside 
  def count 
    @@count 
  end 
end 
 
# Create 3 instances of the Car class 
car1 = Car.new 
car2 = Car.new 
car3 = Car.new 
 
car1.count    # Output: => 3 
car2.count    # Output: => 3 
car3.count    # Output: => 3 

In the example above, the @@count class variable is initialized (given a value) at the class level. Then, each time the Car class is instantiated, the initialize method is executed, and the value of @@count is incremented by 1. The count method is a getter, required to access the @@count class variable from outside the class. Note that accessor methods (explained above) such as attr_accessor, attr_reader, and attr_writer do not work with class variables. Rails provides accessor methods that work with class variables, named cattr_accessor, cattr_reader and cattr_writer.

Notice how @@count is accessible inside the initialize and count instance methods. Also, its value persists between all instances of the Car class.

Any changes in a class variable value will reflect on all of its instances and subclasses. Whether the value is changed in the class where the variable was defined or any of its descendants, it changes throughout the whole hierarchy.

Let's continue the example above:


class Sedan < Car 
  def mess_up_count 
    @@count = 345 
  end 
end 
 
s = Sedan.new 
s.count        # Output: => 4 
 
s.mess_up_count 
s.count        # Output: => 345 
car3.count    # Output: => 345 

In the example above, the Sedan subclass inherited @@count and its value from Car. Then, we called the mess_up_count method, which changed the value of @@count to 345. Notice how the value of @@count in the car3 object (instance of the Car class) was also changed. This often causes undesired effects, and it's the reason why class variables are not often used in Ruby.

The class_variables method returns an array containing the names of all class variables in a specific class. It includes inherited class variables, as well as those defined within the class. If used with the false flag, like Car.class_variables(false), it omits inherited class variables.


Car.class_variables  # Output: => [:@@count] 
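
The effect of the false flag is easier to see in a subclass that defines a class variable of its own. In the sketch below, @@doors is a hypothetical class variable added only for this illustration; the result is sorted because we make no assumption about the order in which the names are returned.

```ruby
class Car
  @@count = 0
end

class Sedan < Car
  @@doors = 4  # hypothetical class variable defined in the subclass
end

Sedan.class_variables.sort    # Output: => [:@@count, :@@doors]
Sedan.class_variables(false)  # Output: => [:@@doors]
```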

It is possible to access (read and write) class variables from outside the class, by using the class_variable_get and class_variable_set built-in methods. That's included in this post for the sake of completeness, but it's usually a terrible practice as it breaks encapsulation.
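
A minimal sketch of both methods follows. As with instance variables, the name is passed as a symbol, including the @@ prefix.

```ruby
class Car
  @@count = 0
end

Car.class_variable_get(:@@count)      # Output: => 0
Car.class_variable_set(:@@count, 10)  # Output: => 10
Car.class_variable_get(:@@count)      # Output: => 10
```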

Class instance variables


Class instance variable names also begin with @. However, they are defined at class level, outside any methods. They belong to the class object itself, so they can only be accessed by class methods; instance methods must go through a class method to reach them. Each class has a single copy of the variable, so a change triggered by one instance is seen by all other instances of that class, but not by its subclasses. In other words, class instance variables are not inheritable. Earlier we saw how all classes are instances of a built-in class called Class. That is what makes class instance variables possible.


class Vehicle 
  @count = 0        # This is a class instance variable 
 
  def initialize 
    self.class.increment_count 
    self.class.show_count 
  end 
 
  def self.increment_count    # This is a class method 
    @count += 1 
  end 
 
  def self.show_count        # This is a class method 
    puts @count 
  end 
 
end 
 
class Car < Vehicle 
  @count = 0 
end 
 
v1 = Vehicle.new    # Output: 1 
v2 = Vehicle.new    # Output: 2 
v3 = Vehicle.new    # Output: 3 
 
car1 = Car.new        # Output: 1 
car2 = Car.new        # Output: 2 
 
v3 = Vehicle.new    # Output: 4 

Let's review the example above. A class instance variable called @count is set in the Vehicle class, with an initial value of 0. Every time the Vehicle class is instantiated, the initialize method calls self.class.increment_count to increment the value of @count and self.class.show_count to print the new value. Then, we have the Car class, which is a subclass of Vehicle and inherits all of its methods. However, it does not inherit the @count class instance variable, as this type of variable is not inheritable. That's why Car sets its own @count to 0, and why the counter works within the Car class but keeps its own count.

Methods prefixed with self., such as self.increment_count and self.show_count, are class methods. That is the only kind of method capable of accessing class instance variables. We will get back to class methods soon.

Local variables


A local variable within a class is like any other local variable in Ruby. It is only accessible within the exact scope in which it's created. If defined within a method, it is only available inside that method.


class Car  
  def initialize 
    wheels = 4 
  end 
   
  def print_wheels 
    print wheels 
  end 
end 
 
c = Car.new 
c.print_wheels        # Output: NameError: undefined local variable or method `wheels'…     

Constants


Constants are used to store values that should not be changed. Their names must start with an uppercase letter. By convention, most constant names are written in all uppercase letters with an underscore as word separator, such as SOME_CONSTANT.

Constants defined within classes can be accessed by all methods of that class. Those created outside a class can be accessed globally (within any method or class).


class Car  
  WHEELS = 4 
 
  def initialize 
    puts WHEELS 
  end 
end 
 
c = Car.new     # Output: 4 

Note that Ruby does not stop us from changing the value of a constant; it only issues a warning.


SOME_CONSTANT = "foo" 
SOME_CONSTANT = "bar" 
warning: already initialized constant SOME_CONSTANT 
warning: previous definition of SOME_CONSTANT was here 

In Ruby, all class and module names are constants, but convention dictates they should be written in camel case, such as SomeClass.

Constants can be accessed from outside the class, even within another class, by using the :: (double colon) operator. To access the WHEELS constant from outside the Car class, we would use Car::WHEELS. The :: operator allows constants, public instance methods and class methods to be accessed from outside the class or module on which they are defined.
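
Here is a short sketch of the :: operator being used at the top level and inside another class. The Garage class and its total_wheels method are hypothetical names introduced only for this example.

```ruby
class Car
  WHEELS = 4
end

# Garage is a hypothetical class that reads a constant defined in Car
class Garage
  def total_wheels(number_of_cars)
    number_of_cars * Car::WHEELS
  end
end

Car::WHEELS                 # Output: => 4
Garage.new.total_wheels(3)  # Output: => 12
```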

A built-in method called private_constant makes constants private (accessible only within the class on which they were created). The syntax is as follows:


class Car  
  WHEELS = 4 
 
  private_constant :WHEELS
end 
 
Car::WHEELS    # Output: NameError: private constant Car::WHEELS referenced 

Class methods and instance methods


Instance methods


All methods defined inside a class with the def method_name syntax are instance methods. They are the most common type of method seen in Ruby code.


class Koan 
  def say_koan 
    puts "What is your original face before you were born?" 
  end 
end 
 
k = Koan.new 
k.say_koan    # Output: What is your original face before you were born? 

The built-in method instance_methods returns an array containing the names of all instance methods of a class. The false flag excludes inherited instance methods.


Koan.instance_methods(false)  # Output: => [:say_koan] 

Class methods


Class methods can be called directly on the class, without instantiating it. When defining them, we prefix the method name with self. (as in def self.stuff). As seen above, only class methods can access class instance variables.


class Zabuton 
  def self.stuff 
    puts "Stuffing zabuton…" 
  end 
end 
 
# Call the class method without instantiating the class 
Zabuton.stuff  # Output: Stuffing zabuton… 
Zabuton::stuff  # Output: Stuffing zabuton… 
 
# Call the class method through an object 
z = Zabuton.new 
z.class.stuff  # Output: Stuffing zabuton… 

The following syntax can also be used to define class methods and will produce the same result as the above syntax. Remember this example; we will get back to it later to explain the meaning of class << self.


class Zabuton 
  class << self 
    def stuff 
      puts "Stuffing zabuton…" 
    end 
  end 
end 

We can also call a class method from within an instance method, by prefixing it with self.class, as in the following example.


class Zabuton 
  def initialize 
    self.class.stuff  # calling the stuff class method 
  end 
 
  def self.stuff 
    puts "Stuffing zabuton…" 
  end 
end 
 
z = Zabuton.new    # Output: Stuffing zabuton… 

The built-in method called methods returns an array including the names of all class methods of a specific class. If used with the false flag, inherited class methods are omitted.


Zabuton.methods(false)  # Output: => [:stuff] 

Public, Private, and Protected methods


Ruby provides three types of methods: public, private, and protected.

Public Methods


Public methods are the most widely used and can be accessed from outside the class.


class Koan 
  def say_koan 
    puts "How do you catch a flying bird without touching it?" 
  end 
end 
 
k = Koan.new 
k.say_koan    # Output: How do you catch a flying bird without touching it? 

In the above example, we were able to call the say_koan method from outside the object because it is a public method. No further explanation is required as all examples in this post (up to this point) are public methods.

To list all public instance methods in a class, use the public_instance_methods built-in method. To list public class methods, use public_methods. As usual, the false flag excludes inherited methods.


Koan.public_instance_methods(false)  # Output: => [:say_koan] 

Private Methods


To define a private method, we use the private keyword, which is actually a built-in method implemented in a class called Module. A private method can only be called by another method within the class on which it was defined (or one of its subclasses).


class Koan 
  def call_say_koan 
    say_koan 
  end 
 
  private 
    def say_koan 
      puts "What is the sound of one hand clapping?" 
    end 
end 
 
k = Koan.new 
k.say_koan    # Output: NoMethodError: private method `say_koan' called for #<Koan:0x000000021e7380> 
k.call_say_koan        # Output: What is the sound of one hand clapping? 

In the above example, we could not call the say_koan private method directly (from outside the class), but we could call the call_say_koan public method which, in turn, called say_koan.

Also in the above example, the built-in private method was used with no arguments. Hence, all methods defined below it were made private.

The private method can also be used with previously defined method names (passed as symbols) as arguments.


class Foo 
  def some_method 
  end 
 
  private :some_method 
end 

In order to make a class method private, use the private_class_method keyword/method instead of private.
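As a minimal sketch of private_class_method (the Temple class and its method names are made up for illustration), the following hides a class-level helper while keeping a public class method that relies on it:

```ruby
class Temple
  # Public class method that relies on a private helper
  def self.announce
    "Announcement: #{secret_phrase}"
  end

  def self.secret_phrase
    "the gate is open"
  end

  # Make the previously defined class method private
  private_class_method :secret_phrase
end

Temple.announce  # => "Announcement: the gate is open"

begin
  Temple.secret_phrase
rescue NoMethodError => e
  e.class        # => NoMethodError (the helper is private)
end
```

Like private, private_class_method takes the method names as symbols, so it must appear after the methods it refers to.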

Private methods can't be called with an explicit receiver. Before Ruby 2.7, that included self: trying to call the say_koan method with self as a receiver (self.say_koan) within call_say_koan resulted in the following exception (since Ruby 2.7, a literal self receiver is allowed):


NoMethodError: private method `say_koan' called for #<Koan:0x000000021eb548> 

The respond_to? method returns false when given the name of a private method as an argument. As of Ruby 2.0, it also returns false for protected methods.


k.respond_to? :say_koan  # Output: => false
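If we do need to know whether a private method exists, respond_to? accepts an optional second argument that includes private and protected methods in the search. A small sketch, using a hypothetical Riddle class:

```ruby
class Riddle
  private

  def answer
    "mu"
  end
end

r = Riddle.new
r.respond_to?(:answer)        # => false
r.respond_to?(:answer, true)  # => true (private methods are included)
```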

To list all private instance methods in a class, use the private_instance_methods built-in method. For private class methods, use private_methods.


Koan.private_instance_methods(false)  # Output: => [:say_koan]

Protected Methods


To define a protected method, we use the protected keyword (which is actually a method). Like private methods, protected methods can be called by other methods within the class on which they were defined (or one of its subclasses). The difference is that protected methods can also be called on other instances of the same class, as long as the call is made from within a method of that class.

Ruby has no protected_class_method counterpart to private_class_method; in practice, protected is used with instance methods.

Let's suppose we need to select a few meditators to participate in a study. To find the most experienced meditators, we need to compare their total hours of meditation. However, we don't want the number of hours to be visible.


class Meditator 
  def initialize(hours) 
    @hours = hours 
  end 
   
  def more_experienced?(other_person) 
    hours > other_person.hours 
  end 
   
  protected 
    attr_reader :hours  # We have made the accessor protected 
end 
   
m1 = Meditator.new 3000 
m2 = Meditator.new 5000 
 
m2.more_experienced? m1  # Output: => true 
m1.more_experienced? m2  # Output: => false 

Similar code could be used to protect any kind of sensitive data from outside access (outside the class and its instances), although protected methods are not commonly employed in Ruby.
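To see the protection at work, here is a short sketch with a hypothetical Monk class. The comparison works because it runs inside an instance method, while a direct call from outside raises an exception:

```ruby
class Monk
  def initialize(years)
    @years = years
  end

  # Allowed: a protected method called on another instance of the same class
  def senior_to?(other)
    years > other.years
  end

  protected

  attr_reader :years
end

elder  = Monk.new(10)
novice = Monk.new(3)

elder.senior_to?(novice)  # => true

begin
  elder.years             # called from outside the class
rescue NoMethodError => e
  e.class                 # => NoMethodError
end
```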

When called with no arguments (as in the above example), the protected method turns all methods defined below it into protected methods. It can also be used to protect previously defined methods, as in the following example.


class Foo 
  def some_method 
  end 
 
  protected :some_method 
end 

To list all protected instance methods in a class, use the protected_instance_methods built-in method. For protected class methods, use protected_methods.


Meditator.protected_instance_methods(false)  # Output: => [:hours] 

The self keyword


The self keyword is always available, and it points to the current object. In Ruby, all method calls consist of a message sent to a receiver. In other words, all methods are invoked on an object. The object on which the method is called is the receiver, and the method is the message. If we call "foo".upcase, the "foo" object is the receiver and upcase is the message. If we don't specify an object (a receiver) when calling a method, it is implicitly called on the self object.

Self keyword at class level


When used within a class but outside any instance methods, self refers to the class itself.


class Foo 
  @@self_at_class_level = self   
 
  def initialize 
    puts "self at class level is #{@@self_at_class_level}" 
  end 
end 
 
f = Foo.new     # Output: self at class level is Foo 

Self keyword at instance methods


When inside an instance method, the self keyword refers to that specific instance. In other words, it refers to the object where it was called.


class Meditation 
  def initialize 
    puts "self within an instance method is #{self}" 
  end 
end 
 
zazen = Meditation.new     # Output: self within an instance method is #<Meditation:0x00000000ab2b38> 

Notice that #<Meditation:0x00000000ab2b38> is a string representation of the zazen object, which is an instance of the Meditation class.

For the next example, we will use a module. Modules will be covered further below but, in the current context, the Meditable module is used as a container for storing methods which will be added to and used by the Sitting class.


module Meditable 
  def meditate 
    "Practicing #{self.meditation_name}…"   
  end 
end 
 
class Sitting 
  include Meditable 
  attr_accessor :meditation_name
 
  def initialize(meditation_name) 
    @meditation_name = meditation_name 
  end 
end 
 
s = Sitting.new "zazen" 
s.meditate    # Output:  => "Practicing zazen…" 

In the above example, the self keyword refers to the instance of the Sitting class from which the meditate method was called.

Singleton Methods and Metaclasses


All instance methods defined in this post's examples are global methods. That means they are available in all instances of the class on which they were defined. In contrast, a singleton method is implemented on a single object.

There is an apparent contradiction. Ruby stores methods in classes and all methods must be associated with a class. The object on which a singleton method is defined is not a class (it is an instance of a class). If only classes can store methods, how can an object store a singleton method? When a singleton method is created, Ruby automatically creates an anonymous class to store that method. These anonymous classes are called metaclasses, also known as singleton classes or eigenclasses. The singleton method is associated with the metaclass which, in turn, is associated with the object on which the singleton method was defined.

If multiple singleton methods are defined within a single object, they are all stored in the same metaclass.


class Zen 
end 
 
z1 = Zen.new 
z2 = Zen.new 
 
def z1.say_hello  # Notice that the method name is prefixed with the object name 
  puts "Hello!" 
end 
 
z1.say_hello    # Output: Hello! 
z2.say_hello    # Output: NoMethodError: undefined method `say_hello'… 

In the above example, the say_hello method was defined within the z1 instance of the Zen class but not the z2 instance.

The following example shows a different way to define a singleton method, with the same result.


class Zen 
end 
 
z1 = Zen.new 
z2 = Zen.new 
 
class << z1 
  def say_hello 
    puts "Hello!" 
  end 
end 
 
z1.say_hello    # Output: Hello! 
z2.say_hello    # Output: NoMethodError: undefined method `say_hello'… 

In the above example, class << z1 changes the current self to point to the metaclass of the z1 object; then, it defines the say_hello method within the metaclass.

Both of the above examples serve to illustrate how singleton methods work. There is, however, an easier way to define a singleton method: using a built-in method called define_singleton_method.


class Zen 
end 
 
z1 = Zen.new 
z2 = Zen.new 
 
z1.define_singleton_method(:say_hello) { puts "Hello!" } 
 
z1.say_hello    # Output: Hello! 
z2.say_hello    # Output: NoMethodError: undefined method `say_hello'… 

We learned earlier that classes are also objects (instances of the built-in class called Class). We also learned about class methods. Well, class methods are nothing more than singleton methods associated with a class object. The following example was already seen in the section about class methods. After learning about metaclasses, we may look at it again with a deeper understanding.


class Zabuton 
  class << self  
    def stuff 
      puts "Stuffing zabuton…" 
    end 
  end 
end 

All objects may have metaclasses. That means classes can also have metaclasses. In the above example, class << self modifies self so it points to the metaclass of the Zabuton class. When a method is defined without an explicit receiver (the class/object on which the method will be defined), it is implicitly defined within the current scope, that is, the current value of self. Hence, the stuff method is defined within the metaclass of the Zabuton class. The above example is just another way to define a class method. IMHO, it's better to use the def self.my_new_class_method syntax to define class methods, as it makes the code easier to understand. The above example was included so we understand what's happening when we come across the class << self syntax.

Ruby provides other useful built-in methods for handling singleton methods and metaclasses.

singleton_class: Returns the singleton class of an object. If there is no singleton class, one is created.


Zen.singleton_class    # Output: => #<Class:Zen> 

singleton_method: Looks for a singleton method and returns a corresponding method object, which can be stored in a variable, passed around, and called with the call method.


s = z1.singleton_method(:say_hello) 
s.call    # Output: Hello!

singleton_methods: Returns an array of names of the singleton methods associated with a specific object.


z1.singleton_methods  # Output: => [:say_hello]  
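The same built-in methods can be used to check the earlier claim that class methods are singleton methods stored in the metaclass of the class. A small sketch, using a hypothetical Cushion class:

```ruby
class Cushion
  def self.stuff
    "Stuffing cushion"
  end
end

# The class method shows up as a singleton method of the class object
Cushion.singleton_methods                        # => [:stuff]

# It is stored as an instance method of the class's metaclass
Cushion.singleton_class.instance_methods(false)  # => [:stuff]
```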

Modules as mixins


Modules are used as namespaces and as mixins. This post only covers (briefly) mixins. Using modules for namespacing is well explained in this post at the Practicing Ruby website.

We learned before that Ruby does not support multiple inheritance. However, there are cases where a class would benefit from acquiring methods defined within multiple other classes. That is made possible by a construct called a module. A module is somewhat similar to a class, except it supports neither inheritance nor instantiation. It is mostly used as a container for storing multiple methods. One way to use a module is to employ an include or extend statement within a class. That way, the class gains access to all methods and constants defined within the module. It is said that the module is mixed into the class. So, a mixin is just a module included in a class. A single module can be mixed into multiple classes, and a single class can mix in multiple modules; thus, most limitations imposed by Ruby's single inheritance model are overcome by the mixin feature.

All modules are instances of the Module class.


module Foo 
end 
 
Foo.instance_of? Module  # Output: => true 

In the following example, the JapaneseGreetings module is included (as a mixin) in the Person class.


module JapaneseGreetings
  def hello
    puts "Konnichiwa"
  end

  def goodbye
    puts "Sayōnara"
  end
end

class Person
  include JapaneseGreetings
end

p = Person.new
p.hello  # Output: Konnichiwa
p.goodbye  # Output: Sayōnara
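The extend statement mentioned earlier works differently from include: it adds the module's methods to a single object instead of to all instances of a class. When used inside a class body, that object is the class itself, so the methods become class methods. A brief sketch (the Greetings module and Visitor class are illustrative names):

```ruby
module Greetings
  def hello
    "Konnichiwa"
  end
end

class Visitor
  extend Greetings  # the module's methods become class methods of Visitor
end

Visitor.hello                    # => "Konnichiwa"
Visitor.new.respond_to?(:hello)  # => false (instances do not get the method)

guest = Object.new
guest.extend(Greetings)          # the methods become singleton methods of guest
guest.hello                      # => "Konnichiwa"
```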

Modules deserve a post of their own; this is only a brief introduction.

Classes are modules


Remember how all classes are instances of a built-in class called Class? Well, Class is a subclass of Module (another built-in class).


Class.superclass  # Output: => Module 

Most of the built-in methods used to manipulate classes are defined in the Module class. Notice how the following list includes many of the methods discussed in this post.


Module.instance_methods(false) 
 => [:<=>, :module_exec, :class_exec, :<=, :>=, :==, :===, :include?, :included_modules, :ancestors, :name, 
 :public_instance_methods, :instance_methods, :private_instance_methods, :protected_instance_methods, :const_get,
 :constants, :const_defined?, :const_set, :class_variables, :class_variable_get, :remove_class_variable, 
 :class_variable_defined?, :class_variable_set, :private_constant, :public_constant, :singleton_class?, 
 :deprecate_constant, :freeze, :inspect, :module_eval, :const_missing, :prepend, :method_defined?, :class_eval,
 :public_method_defined?, :private_method_defined?, :<, :public_class_method, :>, :protected_method_defined?,
 :private_class_method, :to_s, :autoload, :autoload?, :instance_method, :public_instance_method, :include] 

As we already learned, it is standard practice to implement "generic" code (which can be used in different contexts) in the superclass and add extra specialization within subclasses. An example is how the Class class inherits all the above instance methods from the Module class and implements three additional methods.


Class.instance_methods(false) 
 => [:new, :allocate, :superclass] 

The allocate method allocates memory and creates a new "empty" instance of the class, without calling the initialize method. The new method calls allocate, then invokes the initialize method on the newly created object. As for superclass, it returns the superclass of a given class (the class object itself, not just its name).
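The difference between new and allocate is easy to observe with a class whose initialize method sets an instance variable. The Bell class below is a made-up example:

```ruby
class Bell
  attr_reader :rung

  def initialize
    @rung = true
  end
end

Bell.new.rung       # => true
Bell.allocate.rung  # => nil (initialize was never called)

Bell.superclass     # => Object
```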

In his book The Ruby Programming Language, Yukihiro Matsumoto (the creator of Ruby, AKA Matz) demonstrates what the new method would look like if it were written in Ruby:


def new(*args) 
  o = self.allocate # Create a new object of this class 
  o.initialize(*args) # Call the object's initialize method with our args 
  o # Return new object; ignore return value of initialize 
end 

In brief, we might say that classes are modules with two significant extra functionalities: inheritance and instantiation. Another difference is that, unlike modules, classes cannot be used as mixins.


Saturday, June 11, 2016

Ruby Iterators, Enumerators, Enumerable, and Loops

Iterators


Iterator methods are available in collection objects such as arrays and hashes. The most widely used iterator method is each.


[ 1, 2, 3 ].each { |n| puts n }
Output:

1
2
3

In the example above, the each method is called on an array. It takes a block as an argument and runs the code within the block on each element of the array. At each iteration, the value of n (which is passed to the block as an argument) corresponds to one item of the array. The code inside the block will print each array value as it is received from the each method. Instead of printing the array values, we could do a number of other things within the block. In other words, the iterator's job is to deliver each array item to the block and the block contains the code that will run on these items.

To iterate means to do the same thing multiple times. However, in Ruby, the term iterator is used in different ways. In this post, we will call iterator any method that expects a block and iterates (loops) through items in a collection.

As explained in this post about blocks, the { } is interchangeable with do..end. The above example can also be written like this:


[ 1, 2, 3 ].each do |n|
  puts n
end

Number (integer) iterators


The Integer class provides some useful numeric iterators. We won't go into details here as their names are pretty self-explanatory. The most widely used are the following:


3.times {  print "hello " }  # Output: hello hello hello

3.upto(10) { |n| print "#{n} " }  # Output: 3 4 5 6 7 8 9 10

10.downto(3) { |n| print "#{n} " }  # Output: 10 9 8 7 6 5 4 3

3.step(10, 2) { |n| print "#{n} " }  # Go from 3 to 10 in steps of 2. Output: 3 5 7 9

Note that calling the each and reverse_each methods on a range (an instance of the Range class) will yield the same result as the upto and downto methods. Example:


r = (3..10)

r.each { |n| print "#{n} " }  # Output: 3 4 5 6 7 8 9 10

r.reverse_each { |n| print "#{n} " }  # Output: 10 9 8 7 6 5 4 3

String Iterators


These method names are also self-explanatory.


s = "What is \nthe sound \nof silence?"  # Notice the line breaks (\n).

each_char

s.each_char { |x| puts x } 
Output:

W
h
a
t

i
s
# output truncated here

each_line

s.each_line { |x| puts x }
Output:

What is
the sound
of silence?

Strings also provide iterator methods called each_byte and each_codepoint.

We can also iterate through the letters of the alphabet using a range object.


("A".."E").each { |x| puts x }
Output:

A
B
C
D
E

Array Iterators


The iterator methods covered in this section are defined within the Array class. The Enumerable module, which is explained further below, provides dozens of other iterator methods which can also be used on arrays.


a = [ "zazen", "kinhin", "koan" ]

a.each { |x| puts x }
Output:

zazen
kinhin
koan

The each_index method is like each, but it yields each item's index instead of its value.

a.each_index { |x| puts x } 
Output:

0
1
2

Hash Iterators


The following iterators are defined within the Hash class. The Enumerable module (discussed below), offers many more iterator methods which can also be used on hashes.

Let's create a test hash to use on the following examples:


h = { "meditation": "zazen", "time": 40, "posture": "kekkafuza" }

Iterate through the hash keys with each_key:


h.each_key { |key| puts key }
Output:

meditation
time
posture

Iterate through the hash values with each_value:


h.each_value { |value| puts value }
Output:

zazen
40
kekkafuza

Iterate through the hash keys and values:


h.each { |key,value| puts "#{key}: #{value}" }
Output:

meditation: zazen
time: 40
posture: kekkafuza

The each_pair method is an alias to the each method. Let's confirm:


Hash.instance_method(:each) == Hash.instance_method(:each_pair)  # Output: => true
 

The Enumerator class


Most iterator methods rely on the each method under the hood. Hence, it is useful to learn how it works.

When the each method is called and no block is provided, it returns an enumerator object, which is an instance of the Enumerator class.


e =  [ 1, 2, 3, 4, 5 ].each  # Output: => #<Enumerator: [1, 2, 3, 4, 5]:each>

Let's look at the methods provided by the Enumerator class:


Enumerator.instance_methods(false)
 => [:size, :each, :next, :rewind, :with_index, :with_object, :next_values, :peek_values, :peek, :feed]
 

Notice that the false flag was passed to instance_methods so only the methods implemented within the Enumerator class are shown (inherited methods are omitted).

Let's test some of the methods listed above:


e.size  # Return the number of items in the enumerator
 => 5
e.next  # Return the next item and move the internal position forward
 => 1
e.next
 => 2
e.peek  # Return the next item without moving the internal position
 => 3
e.next
 => 3
e.rewind  # Set the internal position to the first item
e.next
 => 1
 

As seen in the above example, enumerator objects provide an easy way to iterate through a collection, such as an array or hash.

Enumerators provide internal and external iteration. Internal means an iterator method will "drive" the iteration. An example is how the each or map methods automatically yield each item in a collection to a block of code. External iteration is when we control the iteration, as in the above example where we called next to get the next value in the collection and so on.
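One detail worth knowing about external iteration: calling next after the last item raises a StopIteration exception. The built-in loop method rescues that exception and simply ends the loop, which gives a compact way to drain an enumerator. A minimal sketch (the drain helper is a hypothetical name):

```ruby
# Collect every remaining item from an enumerator using external iteration
def drain(enumerator)
  items = []
  loop do
    items << enumerator.next  # raises StopIteration when nothing is left
  end
  items                       # loop rescued StopIteration and exited
end

drain([ 1, 2, 3 ].each)  # => [1, 2, 3]
```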

The Enumerable Module


In Ruby, almost everything is an object; a couple of relevant exceptions are methods and blocks. An object is an instance of a class. Hence, all arrays are instances of the Array class, hashes are instances of the Hash class, and so on.

Collection classes such as Array, Hash, and Range provide the each method (explained above), which yields the values in the collection, one by one. There is a built-in module called Enumerable, which is used as a mixin by collection classes. It provides multiple methods for working with collections. Every time we create an array or a hash, the methods provided by the Enumerable module are available. These additional methods rely on the each iterator implemented in the corresponding collection class.

If we create a custom class and include a custom method called each, we can use the Enumerable mixin to add extra collection-related functionality. However, this is outside the scope of this post.
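For reference only, a minimal sketch of that idea could look like the following. The Sangha class and its contents are invented for illustration; the only requirement Enumerable places on the class is an each method that yields the items one by one:

```ruby
class Sangha
  include Enumerable

  def initialize(*members)
    @members = members
  end

  # Enumerable builds all of its methods on top of each
  def each
    return to_enum(:each) unless block_given?
    @members.each { |member| yield member }
  end
end

s = Sangha.new("ananda", "kasyapa", "upali")

s.map(&:capitalize)            # => ["Ananda", "Kasyapa", "Upali"]
s.select { |m| m.length > 5 }  # => ["ananda", "kasyapa"]
s.include?("upali")            # => true
s.first(2)                     # => ["ananda", "kasyapa"]
```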

We can easily confirm that the most popular collection classes use the Enumerable mixin:


Array.included_modules # Output: => [Enumerable, Kernel]
Hash.included_modules # Output: => [Enumerable, Kernel]
Range.included_modules # Output: => [Enumerable, Kernel]

The full list of methods provided by Enumerable is as follows:


Enumerable.instance_methods(false).sort
 => [:all?, :any?, :chunk, :chunk_while, :collect, :collect_concat, :count, :cycle, :detect, :drop, :drop_while,
 :each_cons, :each_entry, :each_slice, :each_with_index, :each_with_object, :entries, :find, :find_all, 
 :find_index, :first, :flat_map, :grep, :grep_v, :group_by, :include?, :inject, :lazy, :map, :max, :max_by, 
 :member?, :min, :min_by, :minmax, :minmax_by, :none?, :one?, :partition, :reduce, :reject, :reverse_each, 
 :select, :slice_after, :slice_before, :slice_when, :sort, :sort_by, :take, :take_while, :to_a, :to_h, :zip]

This post covers the most commonly used methods from the above list.

Class implementations of methods provided by the Enumerable module


As seen above, collection classes include the methods provided by the Enumerable module. However, some classes provide their own implementations of some of these methods. When a module is included in a class as a mixin, any methods defined in the class will not be overwritten by methods of the same name defined in the module. Looking into this helps us understand why some Enumerable methods behave differently when called on different objects (which are instances of different classes).

An example is the sort method, provided by the Enumerable module. The Array class uses its own implementation. In contrast, the Hash and Range classes use the implementation from Enumerable.


Array.instance_method(:sort)  # Output: => #<UnboundMethod: Array#sort>
Hash.instance_method(:sort)  # Output: => #<UnboundMethod: Hash(Enumerable)#sort>
Range.instance_method(:sort)  # Output: => #<UnboundMethod: Range(Enumerable)#sort>
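The printed representation of UnboundMethod objects may differ between Ruby versions. A version-independent way to run the same check is to ask each method for its owner, that is, the class or module in which it is defined:

```ruby
Array.instance_method(:sort).owner  # => Array
Hash.instance_method(:sort).owner   # => Enumerable
Range.instance_method(:sort).owner  # => Enumerable
```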

Methods provided by the Enumerable module


We will use arrays as examples because of their simplicity, but the methods below can also be used on hashes and other collection objects.

Loop through indexes and values


The each_with_index method yields each item's index number and its corresponding value.


a = [ "zazen", "kinhin", "koan" ]

a.each_with_index { |value, index| puts "#{index} #{value}" }
Output:

0 zazen
1 kinhin
2 koan

Loop through items and modify them


What if we need to modify the values of collection items? The each method seen above cannot do that, but the map method can. That is one of the most important and widely used iterator methods in Ruby. It passes each collection item to the block and returns a new collection containing the values returned by the block.


a.map { |x| x.upcase }  # Output: => ["ZAZEN", "KINHIN", "KOAN"]

We can also pass a method reference (as a symbol) to the map method, instead of a block. The result is the same as if we passed a block and, within the block, applied that method to each item in the collection. The above example can be rewritten like this:


[ "zazen", "kinhin" ].map &:upcase  # Output: => ["ZAZEN", "KINHIN"]

A longer explanation is available at the "Ampersand and object (&:method)" section of this post about methods.

The map! method is the same as map, except it will alter the collection items in-place instead of returning a new collection. In other words, it passes each collection item to the block and replaces its original value with the value returned by the block.
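The following sketch contrasts the two methods on the same array. Note that map! is defined in the Array class rather than in the Enumerable module, so it is not available on every collection:

```ruby
a = [ "zazen", "kinhin" ]

upcased = a.map { |x| x.upcase }
upcased  # => ["ZAZEN", "KINHIN"]
a        # => ["zazen", "kinhin"] (the original array is unchanged)

a.map! { |x| x.upcase }
a        # => ["ZAZEN", "KINHIN"] (the original array was modified in place)
```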

The collect and collect! methods are aliases to map and map!, so they can be used interchangeably. Let's verify:


Array.instance_method(:map) == Array.instance_method(:collect)  # Output: => true
Array.instance_method(:map!) == Array.instance_method(:collect!)  # Output: => true

Test if all items meet specific criteria


all?


[ 2, 4, 6 ].all? { |x| x.even? }  # Output: => true

If a block containing criteria is not provided, checks if all collection values are truthy. Remember, in Ruby only nil and false (boolean) are "falsy". This is useful to test a collection for nil values.


[ 1, 2, nil, 5 ].all? # Output: => false

Search collection (find items that meet specific criteria)


any?


[ "foo", "baar" ].any? { |x| x.length > 3 } # Output: => true

If a block containing criteria is not provided, checks if there are any truthy values in the collection.


[ nil, false, "foo" ].any? # Output: => true

none?

Opposite of any?. Returns true if the block returns a falsy value for all elements in the collection. If a block is not provided, returns true if all values in the collection are falsy (false or nil).
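A few quick cases illustrate both forms of none?, with and without a block:

```ruby
[ 1, 3, 5 ].none? { |x| x.even? }  # => true  (no item is even)
[ 1, 2, 5 ].none? { |x| x.even? }  # => false (2 is even)

[ nil, false ].none?               # => true  (no truthy values)
[ nil, "foo" ].none?               # => false ("foo" is truthy)
```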

include?

Returns true if the collection includes the value provided as an argument.


["zazen", "shamata", "tonglen" ].include?("tonglen")  # Output: => true

The member? method is an alias to the include? method, at least in the Range class:


Range.instance_method(:include?) == Range.instance_method(:member?)  # Output: => true
 

Count items matching criteria


count


a = [ "foo", "bar", "baz", "foo" ]
a.count("foo") # Output: => 2

a = [ 2, 5, 6, 8, 12 ]
a.count {|i| i.even?} # Output: => 4

Count occurrences of collection items


What if we want to count the occurrences of each item in a collection? In other words, to generate a list of unique values and count the occurrences of each value in the collection? There are several ways to do this; twelve of them are discussed and tested in this very informative post @ Carol's 10 Cents blog. I have chosen a couple of approaches to include here, based on simplicity and performance.


arr = [ "foo", "bar", "baz", "foo", "foo", "baz" ] # Set up test array

Solution 1 (best performance):


Hash[arr.group_by(&:itself).map {|key, value| [key, value.size] }]
# Output: => {"foo"=>3, "bar"=>1, "baz"=>2}

The itself method was introduced in Ruby 2.2; it returns the object it was called on. It is useful mostly for chaining methods.

Solution 2 (easy to understand, slight loss in performance):


count = Hash.new 0; arr.each { |arr_value| count[arr_value ] += 1 }; count
# Output: => {"foo"=>3, "bar"=>1, "baz"=>2}

By default, when we try to access a nonexistent key in a hash, it returns nil. If we provide 0 as an argument to Hash.new, the hash returns 0 instead.


h = Hash.new; h["foo"]  # Output: => nil
h = Hash.new 0; h["foo"]  # Output: => 0

In the above example, every hash key corresponds to the value of an array item. When we encounter a value for the first time (in the array), we try to find an item with the corresponding key in the hash. However, it does not exist yet, so the hash returns 0. Then, the new hash key is created and its default value (0) is incremented by 1. Every time an array value with a matching hash key is found, the value of the corresponding hash item is incremented.
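On Ruby 2.7 and later, Enumerable also offers the tally method, which builds the same hash in a single call. It was not part of the comparison referenced above and is shown here only as an alternative, assuming a recent Ruby version:

```ruby
arr = [ "foo", "bar", "baz", "foo", "foo", "baz" ]

arr.tally  # => {"foo"=>3, "bar"=>1, "baz"=>2}
```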

Select (filter) multiple collection items


Non-destructive selection:

These methods return a new collection containing only the selected items. None of them will modify the original collection.

select

Returns a new collection containing the items that meet the criteria defined within the block. In other words, the items for which the block returned true.


arr = [1, 2, 3, 4, 5, 6, 1, 2, 3, 8]
arr.select { |a| a > 3 }  # Output: => [4, 5, 6, 8]

The difference between the find_all and the select methods is that find_all will always return an array, regardless of the type of object it was called on. That can be demonstrated by calling both methods on a hash:


h = { a: 1, b: 2, c: 3 }

h.find_all { true } # => [[:a, 1], [:b, 2], [:c, 3]]
h.select { true } # => {:a=>1, :b=>2, :c=>3}

reject

Inverse of select. It returns the items for which the block returned false.


arr = [1, 2, 3, 4, 5, 6, 1, 2, 3, 8]
arr.reject { |a| a < 3 } # Output: => [3, 4, 5, 6, 3, 8]

grep

Uses the === operator to return a new collection containing the items whose values match the given expression. The === operator is explained in this post about operators.


a = [ "foo", 3, "bar", 7, "baz", 10, "qux" ] 

a.grep(/ba/) # Output: => ["bar", "baz"]
# is equivalent to:
a.select { |x| /ba/ === x }

Using the === operator under the hood makes grep powerful and flexible.


a = [ "foo", 3, "bar", 7, "baz", 10, "qux" ] 
a.grep(1..8)  # Output: => [3, 7]
a.grep(Integer) # Output: => [3, 7, 10]
a.grep(String) # Output: => ["foo", "bar", "baz", "qux"]

The grep method may also take a block. It yields each matching item to the block and a new array containing the block's output is returned. That is useful for applying operations only to items whose values match the pattern.


a.grep(/ba/) { |x| x.upcase } # Output:  => ["BAR", "BAZ"]
# is equivalent to
a.select { |x| /ba/ === x }.map { |x| x.upcase }

grep_v

Reverse grep. Returns a new array containing the items that do not match the pattern.


[ "foo", "bar", "baz", "qux" ].grep_v(/ba/) # Output: => ["foo", "qux"]

drop_while and take_while

The drop_while and take_while methods are similar to reject and select, except they stop evaluating as soon as they reach the first item for which the block returns false or nil.

There are cases when these methods offer a big performance increase compared to select and reject. For instance, suppose we have a sorted array of 100,000 temperature readings ranging from -50 to +50, and we want to take or drop all items below a certain threshold. In that case, take_while and drop_while perform much better, as they stop evaluating once the specified threshold (temperature) is reached.


arr = [1, 2, 3, 4, 5, 6, 1, 2, 3, 8]
arr.take_while { |a| a < 4 }  # Output: => [1, 2, 3]

Notice that once it reached the first number that is not < 4 (the first 4), it stopped looking. It didn't evaluate the following items, so the values 1, 2, 3 in the second half of the array were not returned.


arr.drop_while { |a| a < 3 }  # Output: => [3, 4, 5, 6, 1, 2, 3, 8]

Notice how the numbers 1 and 2 in the second half of the array were not dropped, because drop_while stopped evaluating at the first 3.

Destructive selection

These methods will modify the collection in-place. Use with caution.

The select! and reject! methods are similar to select and reject. The only difference is they alter the original array instead of returning a new one.

The delete_if method is similar to reject! and keep_if is similar to select!.
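The in-place variants differ mainly in their return values, which matters when chaining calls. In the Array class, select! and reject! return nil when they make no changes, whereas keep_if and delete_if always return the array. A short sketch:

```ruby
arr = [ 1, 2, 3, 4, 5, 6 ]

arr.select! { |n| n.even? }   # => [2, 4, 6]
arr                           # => [2, 4, 6] (modified in place)

arr.reject! { |n| n > 4 }     # => [2, 4]
arr.reject! { |n| n > 10 }    # => nil (nothing was rejected)
arr.delete_if { |n| n > 10 }  # => [2, 4] (always returns the array)

arr.select! { |n| n < 10 }    # => nil (nothing was removed)
arr.keep_if { |n| n < 10 }    # => [2, 4] (always returns the array)
```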

Find (detect) item in collection


find

Returns the first value for which the expression within the block is true.


[ 1, 3, 6, 8, 10 ].find { |n| n > 5 } # Output: => 6

The detect method is an alias to find.


Array.instance_method(:detect) == Array.instance_method(:find)  # Output: => true

find_index

Same as find, except it returns the index of the item instead of its value.


[ "foo", "foo", "bar", "baz" ].find_index { |x| x.include? "ba" } # Output: => 2

Return the first or last (n) collection items


first


a = [ "foo", "bar", "baz" ]
a.first   # Output: => "foo"

The first method can also take an argument and return the first n items:


a.first(2) # Output: => ["foo", "bar"]

take

Yields the same result as the first method when used with an argument. The differences between them are: a) unlike first, take requires an argument; b) Enumerator::Lazy provides a lazy version of the take method, but not the first method, so first is always greedy. Lazy enumerators are explained further below.
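Both differences can be observed with a short sketch. The infinite range is used only to show that take stays lazy on a lazy enumerator, while first evaluates immediately:

```ruby
a = [ "foo", "bar", "baz" ]

a.take(2)   # => ["foo", "bar"]
a.first(2)  # => ["foo", "bar"]
a.first     # => "foo" (calling a.take without an argument raises ArgumentError)

doubled = (1..Float::INFINITY).lazy.map { |n| n * 2 }

doubled.take(3).class  # => Enumerator::Lazy (nothing evaluated yet)
doubled.take(3).to_a   # => [2, 4, 6]
doubled.first(3)       # => [2, 4, 6] (evaluated right away)
```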

last (implemented in the Array and Range classes)

Enumerable does not provide a method to return the last item of a collection. The Array and Range classes provide a method called last which will do just that.


[ "foo", "bar", "baz" ].last   # Output: => "baz"
(1..8).last    # Output: => 8

The Hash class does not provide such a method. However, we can convert the hash keys, values or both into an array and use the last method from the Array class:


h = { "a": 1, "b": 2, "c": 3 }
h.values.last  # Output: => 3
h.keys.last  # Output: => :c
h.to_a.last  # Output: => [:c, 3]

Reduce/fold a collection (e.g., sum all items)


reduce and inject

Applies a binary operation (such as sum or division) to each collection item and stores the new value in the accumulator variable (memo). In other words, it reduces the collection to a single item. Hence, the name reduce. In math and some other programming languages, this operation is known as fold.

In the following example, n is the array item and sum is the accumulator.


[ 2, 3, 5 ].reduce { |sum, n| sum + n }  # Output: => 10
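To see how the accumulator evolves, the sketch below records the block arguments at each step (the steps array is added only for illustration). When no initial value is given, the first item is used as the starting accumulator, so the block runs only twice for three items.

```ruby
steps = []

total = [ 2, 3, 5 ].reduce do |sum, n|
  steps << [sum, n]  # record the accumulator and the current item
  sum + n            # the block's return value becomes the next sum
end

p steps  # Output: [[2, 3], [5, 5]]
p total  # Output: 10
```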

In Ruby, classes such as Fixnum (subclass of Integer) and Float provide methods like + (add), - (subtract), * (multiply) and / (divide). Instead of a block, the reduce method can take the name of one of these methods (as a symbol) as an argument. The given method is applied to all items in the collection and the accumulated result is returned.


[ 2, 3, 5 ].reduce(:+)  # Output: => 10
[ 2, 3, 5 ].reduce(:*)  # Output: => 30

The inject method is an alias to reduce. Let's verify:


Array.instance_method(:reduce) == Array.instance_method(:inject)  # Output: => true
 

Lazy evaluation


Ruby has a feature called lazy evaluation. Providing a thorough explanation would probably take an entire post. Briefly, it is an efficient way to get an arbitrary number of values from a very large or infinite collection.

Iterator methods are eager by default. That means they process all collection items before returning anything. In the following example, we attempt to multiply the first ten items of an infinite sequence by 2. It does not work, as map tries to multiply all (infinitely many) items first and only then return the first ten. The result is an infinite loop.


1.upto(Float::INFINITY).map { |x| x * 2 }.take(10).to_a

Now let's include the lazy method and try again.


1.upto(Float::INFINITY).lazy.map { |x| x * 2 }.take(10).to_a
# Output: => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

By introducing the lazy method, we have created a lazy enumerator, which is an instance of the Enumerator::Lazy class, introduced in Ruby 2.0. Lazy enumerators will only evaluate (process) as many items as are required to generate the desired output.
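One way to observe this is to record which items the map block actually processes. This is only a sketch; the evaluated array exists purely for illustration.

```ruby
evaluated = []

# Nothing is processed yet: map on a lazy enumerator returns another lazy enumerator
doubled = 1.upto(Float::INFINITY).lazy.map { |x| evaluated << x; x * 2 }

# to_a forces the evaluation, but only for the three items take asked for
result = doubled.take(3).to_a

p result     # Output: [2, 4, 6]
p evaluated  # Output: [1, 2, 3]
```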

Lazy enumerators implement lazy versions of many Enumerable methods, as seen below.


Enumerator::Lazy.instance_methods(false).sort
 => [:chunk, :collect, :collect_concat, :drop, :drop_while, :enum_for, :find_all, :flat_map, :force, :grep,
 :grep_v, :lazy, :map, :reject, :select, :slice_after, :slice_before, :slice_when, :take, :take_while, 
 :to_enum, :zip]

It may be easier to grasp the above example by splitting it into two steps:


l = 1.upto(Float::INFINITY).lazy # Output: => #<Enumerator::Lazy: #<Enumerator: 1:upto(Infinity)>>
l.map { |x| x * 2 }.take(10).to_a # Output: => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

Find the item with the maximum or minimum value


The Enumerable mixin provides methods for sorting collection items and finding those with the highest or lowest values. All of them use the Comparable mixin and the <=> (spaceship) operator under the hood, which are both covered in this post about operators.

max

Return the greatest (maximum) value in a collection.


(1..10).max # Output: => 10

Return the three greatest values in descending order.


(1..10).max(3) # Output: => [10, 9, 8]

max_by

Return the greatest value according to specific criteria defined within a block. In the following example, the longest string.


[ "zen", "zazen", "liberation" ].max_by { |x| x.length }  # Output: => "liberation"

min

The min method is the inverse of max. It returns the smallest (minimum) value in a collection.

min_by

The min_by method works the same way as the max_by method, except it returns the minimum value in a collection according to specific criteria.
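For completeness, here is a short sketch mirroring the max and max_by examples above (the variable names are only for illustration):

```ruby
smallest       = (1..10).min     # the minimum value
three_smallest = (1..10).min(3)  # the three smallest values in ascending order
shortest       = [ "zen", "zazen", "liberation" ].min_by { |x| x.length }

p smallest        # Output: 1
p three_smallest  # Output: [1, 2, 3]
p shortest        # Output: "zen"
```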

Iterate backwards


Reverse each

We already discussed the each method; reverse_each is the same, except it will iterate from the last item to the first.


[1,2,3].reverse_each { |x| print x }  # Output: 321

Sort items


Most sorting operations use the spaceship (<=>) operator, explained in this post about operators.

sort

Sorts collection items. Numbers are sorted in ascending order and strings in ascending alphabetical order. Strings beginning with uppercase characters (e.g., Foo or FOO) always come before those beginning with lowercase characters. Example:


[ "foo", "Foo", "Bar", "bar", "qux", "Qux" ].sort
# Output: => ["Bar", "Foo", "Qux", "bar", "foo", "qux"]

Note that it's not possible to sort a collection containing both numbers and strings unless the numbers are stored as strings.


["foo", 2, "bar"].sort
# Output: ArgumentError: comparison of Fixnum with String failed
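Two possible workarounds are sketched below: convert everything to strings before sorting, or keep the original items and only compare their string representations. The sample array is the one from the example above.

```ruby
mixed = [ "foo", 2, "bar" ]

# Convert every item to a string first, then sort the strings
as_strings = mixed.map { |item| item.to_s }.sort

# Keep the original items, but compare them by their string form
by_string = mixed.sort_by { |item| item.to_s }

p as_strings  # Output: ["2", "bar", "foo"]
p by_string   # Output: [2, "bar", "foo"]
```

Digits come before letters because strings are compared character by character using their character codes.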

sort_by

Sort by specific criteria. The block must return a value that can be compared with the others (typically a number) for each item of the collection. The items will be sorted according to those values. In the following example, the items are sorted by length.


[ "liberation", "zen", "zazen" ].sort_by { |word| word.length }
# Output: => ["zen", "zazen", "liberation"]
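In practice, the sort_by block can return any values that are comparable with each other, not only numbers. As a hedged illustration (the word list is made up), returning a lowercase copy of each string gives a case-insensitive sort:

```ruby
words = [ "banana", "apple", "Cherry" ]

# Plain sort puts the capitalized word first
default_order = words.sort

# sort_by compares the lowercase versions but returns the original strings
ignoring_case = words.sort_by { |word| word.downcase }

p default_order  # Output: ["Cherry", "apple", "banana"]
p ignoring_case  # Output: ["apple", "banana", "Cherry"]
```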

Convert collection into an array


The to_a method converts any collection into an array.


{ meditation: "zazen", time: 40 }.to_a  # Output: => [[:meditation, "zazen"], [:time, 40]]
(1..6).to_a  # Output: => [1, 2, 3, 4, 5, 6]

The entries method is an alias to to_a, at least in the Range class.


Range.instance_method(:to_a) == Range.instance_method(:entries)  # Output: => true
 

Iterate (loop) through two arrays simultaneously


The zip method is one way to iterate over two arrays at once.


a = [ "zazen", "kinhin", "koan" ]
b = [ "sit", "walk", "contemplate" ]

a.zip(b).each { | x, y | puts "#{x} - #{y}" }
Output:

zazen - sit
kinhin - walk
koan - contemplate
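It can help to look at what zip returns before iterating: an array of pairs. The sketch below also shows what happens when the second array is shorter (the shortened array is made up for illustration); the missing items are filled with nil.

```ruby
a = [ "zazen", "kinhin", "koan" ]
b = [ "sit", "walk" ]  # one item shorter than a

pairs = a.zip(b)

p pairs
# Output: [["zazen", "sit"], ["kinhin", "walk"], ["koan", nil]]
```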

Other loops


The for, while and until loops presented below are the usual way of looping through collections in many programming languages. However, in Ruby they are frowned upon and not used frequently. Most rubyists prefer using the iterators explained above, such as each and map, to loop through collections.

The for loop


For loops call the each method of the collection under the hood, which passes the value of each item to the loop, which in turn assigns the value to the loop variable.

The loop variable and any other variables defined within the loop will remain defined after it ends.


for i in 0..5
   print i
end

# Output: 012345
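The scoping difference between a for loop and the each iterator can be checked with a small sketch like this one (the variable names last_seen, j and inner are arbitrary):

```ruby
for i in 0..5
  last_seen = i  # defined inside the for loop
end

# Both variables are still defined after the loop ends
p i          # Output: 5
p last_seen  # Output: 5

# Variables introduced inside an each block stay local to the block
(0..5).each { |j| inner = j }

p defined?(j)      # Output: nil
p defined?(inner)  # Output: nil
```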

The while loop


The while loop keeps running while a condition is true (until it becomes false).


n = 0
while n < 10 do  # Run while n is less than 10
    print n
    n += 1  # Increment n by 1 at each iteration of the loop
end

# Output: 0123456789

While loops can also be used as modifiers. Modifiers allow us to append a conditional or loop statement (e.g., if, unless, while, until) onto the end of another statement, which will then be executed conditionally or as a loop.


n = 0
print n += 1 while n < 10

# Output: 12345678910

Notice that the two examples above produce different output. In the first example, print displays n and then the n += 1 statement increments it. In the second example, although the while modifier is written after the print n += 1 statement, the condition (n < 10) is evaluated before the statement is executed. However, n is incremented before it is displayed by print, which is why the output starts at 1 instead of 0. It ends at 10 instead of 9 because, in the last iteration, n is 9 when the condition is evaluated, but it is incremented to 10 by n += 1 before print displays it.

The until loop


The until loop is the inverse of the while loop. It runs until a condition becomes true (while it's false):


n = 0
until n > 10 do  # Run until n is greater than 10
    print n
    n += 1  # Increment n by 1 at each iteration of the loop
end

# Output: 012345678910

Until loops can also be used as modifiers:


n = 0
print n += 1 until n > 10

# Output: 1234567891011

Both while and until loops will run until the condition for terminating is met. That is a problem when the condition is never met and the loop runs indefinitely. That is called an infinite loop and it can make the entire system unresponsive. Hence, while and until loops should be used with caution.

Saturday, May 28, 2016

Ruby operators: equality, comparison, pattern matching and ordering

An operator is a character or a small set of characters that represent an action which is applied to one or more operands. Ruby provides many different kinds of operators; this post covers equality, comparison, pattern matching and ordering operators, all of which are implemented as methods. When one of those operators is reached within the code, it calls the corresponding method. When we type 2+3, we are actually calling the + method of the Integer class and providing 3 as an argument. We can rewrite it as 2.+(3) and get the same result.
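As a quick check that operators really are method calls, the sketch below writes a few of them in both forms (the variable names are only for illustration):

```ruby
with_operator = 2 + 3
with_method   = 2.+(3)         # same call, written as an explicit method call
with_send     = 2.send(:+, 3)  # same call again, dispatched by method name

same_string = "koan".==("koan")  # the == operator as a method call
ordering    = 5.<=>(8)           # the <=> operator as a method call

p with_operator  # Output: 5
p with_method    # Output: 5
p with_send      # Output: 5
p same_string    # Output: true
p ordering       # Output: -1
```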

Operator overloading means changing the behavior of an operator by overriding its corresponding method. That is, however, outside the scope of this post.

Equality operators: == and !=


The == operator, also known as equality or double equal, will return true if both objects are equal and false if they are not.


"koan" == "koan" # Output: => true

The != operator, AKA inequality or bang-equal, is the opposite of ==. It will return true if both objects are not equal and false if they are equal.


"koan" != "discursive thought" # Output: => true

Note that two arrays with the same elements in a different order are not equal, uppercase and lowercase versions of the same letter are not equal and so on.
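A few made-up comparisons illustrating those cases:

```ruby
same_order      = [1, 2, 3] == [1, 2, 3]
different_order = [1, 2, 3] == [3, 2, 1]
different_case  = "zen" == "Zen"

p same_order       # Output: true
p different_order  # Output: false
p different_case   # Output: false
```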

When comparing numbers of different types (e.g., integer and float), if their numeric value is the same, == will return true.


2 == 2.0 # Output: => true

Additional methods for testing equality


equal?


Unlike the == operator, which tests if both operands are equal, the equal? method checks if the two operands refer to the same object. This is the strictest form of equality in Ruby.

Example:

a = "zen"
b = "zen"

a.object_id  # Output: => 20139460
b.object_id  # Output: => 19972120

a.equal? b  # Output: => false

In the example above, we have two strings with the same value. However, they are two distinct objects, with different object IDs. Hence, the equal? method will return false.

Let's try again, only this time b will be a reference to a. Notice that the object ID is the same for both variables, as they point to the same object.


a = "zen"
b = a

a.object_id  # Output: => 18637360
b.object_id  # Output: => 18637360

a.equal? b  # Output: => true

eql?


In the Hash class, the eql? method is used to test keys for equality. Some background is required to explain this. In the general context of computing, a hash function takes a string (or a file) of any size and generates a string or integer of fixed size called a hashcode, commonly referred to as just a hash. Some commonly used hashcode types are MD5, SHA-1, and CRC. They are used in cryptographic algorithms, database indexing, file integrity checking, etc.

Some programming languages, such as Ruby, provide a collection type called a hash table. Hash tables are dictionary-like collections which store data in pairs, consisting of unique keys and their corresponding values. Under the hood, those keys are stored as hashcodes. Hash tables are commonly referred to as just hashes. Notice how the word hash can refer to a hashcode or to a hash table. In the context of Ruby programming, the word hash almost always refers to the dictionary-like collection.

Ruby provides a built-in method called hash for generating hashcodes. In the example below, it takes a string and returns a hashcode. Notice how strings with the same value always have the same hashcode, even though they are distinct objects (with different object IDs).


"meditation".hash  # Output: => 1396080688894079547
"meditation".hash  # Output: => 1396080688894079547
"meditation".hash  # Output: => 1396080688894079547

The hash method is implemented in the Kernel module, included in the Object class, which is the default root of all Ruby objects. Some classes such as Symbol and Integer use the default implementation, others like String and Hash provide their own implementations.


Symbol.instance_method(:hash).owner  # Output: => Kernel
Integer.instance_method(:hash).owner # Output: => Kernel

String.instance_method(:hash).owner  # Output: => String
Hash.instance_method(:hash).owner  # Output: => Hash

In Ruby, when we store something in a hash (collection), the object provided as a key (e.g., string or symbol) is converted into and stored as a hashcode. Later, when retrieving an element from the hash (collection), we provide an object as a key, which is converted into a hashcode and compared to the existing keys. If there is a match, the value of the corresponding item is returned. The comparison is made using the eql? method under the hood.


"zen".eql? "zen"    # Output: => true
# is the same as
"zen".hash == "zen".hash # Output: => true

In most cases, the eql? method behaves similarly to the == method. However, there are a few exceptions. For instance, eql? does not perform implicit type conversion when comparing an integer to a float.


2 == 2.0    # Output: => true
2.eql? 2.0 # Output: => false
2.hash == 2.0.hash # Output: => false
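The practical consequence shows up when looking up hash keys. In the sketch below (sample hashes made up for illustration), 2 and 2.0 are different keys, while two distinct string objects with the same value are treated as the same key:

```ruby
h = { 2 => "integer key" }

int_lookup   = h[2]    # found: 2.eql?(2) is true
float_lookup = h[2.0]  # not found: 2.eql?(2.0) is false

key_a = "zen"
key_b = key_a.dup  # a distinct object with the same value

meditation = { key_a => "sitting" }
string_lookup = meditation[key_b]  # found: key_a.eql?(key_b) is true

p int_lookup           # Output: "integer key"
p float_lookup         # Output: nil
p key_a.equal?(key_b)  # Output: false
p string_lookup        # Output: "sitting"
```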

Case equality operator: ===


Many of Ruby's built-in classes, such as String, Range, and Regexp, provide their own implementations of the === operator, also known as case-equality, triple equals or threequals. Because it's implemented differently in each class, it will behave differently depending on the type of object it was called on. Generally, it returns true if the object on the right "belongs to" or "is a member of" the object on the left. For instance, it can be used to test if an object is an instance of a class (or one of its subclasses).


String === "zen"  # Output: => true
Range === (1..2)   # Output: => true
Array === [1,2,3]   # Output: => true
Integer === 2   # Output: => true

The same result can be achieved with other methods which are probably best suited for the job, such as is_a? and instance_of?.
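A brief sketch of those two methods next to ===, using a float as the sample value: is_a? also accepts superclasses, whereas instance_of? only matches the exact class.

```ruby
case_equality  = Numeric === 2.0            # true: Float is a subclass of Numeric
via_is_a       = 2.0.is_a?(Numeric)         # true: same check, arguably more readable
exact_class    = 2.0.instance_of?(Float)    # true: 2.0 is exactly a Float
not_exact      = 2.0.instance_of?(Numeric)  # false: Numeric is only an ancestor

p case_equality  # Output: true
p via_is_a       # Output: true
p exact_class    # Output: true
p not_exact      # Output: false
```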

Range Implementation of ===


When the === operator is called on a range object, it returns true if the value on the right falls within the range on the left.


(1..4) === 3  # Output: => true
(1..4) === 2.345 # Output: => true
(1..4) === 6  # Output: => false

("a".."d") === "c" # Output: => true
("a".."d") === "e" # Output: => false

Remember that the === operator invokes the === method of the left-hand object. So (1..4) === 3 is equivalent to (1..4).=== 3. In other words, the class of the left-hand operand will define which implementation of the === method will be called, so the operand positions are not interchangeable.
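Swapping the operands makes this visible, as in the following sketch. With the integer or the string on the left, the Integer and String implementations of === are used, and those simply test for equality.

```ruby
range_on_left   = (1..4) === 3     # Range#===: is 3 within the range?
integer_on_left = 3 === (1..4)     # Integer#===: is 3 equal to the range?

class_on_left  = String === "zen"  # Module#===: is "zen" a String?
string_on_left = "zen" === String  # String#===: is "zen" equal to the class?

p range_on_left    # Output: true
p integer_on_left  # Output: false
p class_on_left    # Output: true
p string_on_left   # Output: false
```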

Regexp Implementation of ===


Returns true if the string on the right matches the regular expression on the left.

/zen/ === "practice zazen today"  # Output: => true
# is similar to
"practice zazen today" =~ /zen/  # Output: => 11

The only relevant difference between the two examples above is that, when there is a match, === returns true and =~ returns an integer, which is a truthy value in Ruby. We will get back to this soon.

Implicit usage of the === operator on case/when statements


This operator is also used under the hood on case/when statements. That is its most common use.


minutes = 15

case minutes
  when 10..20
    puts "match"
  else
    puts "no match"
end

# Output: match

In the example above, if Ruby had implicitly used the double equal operator (==), the range 10..20 would not be considered equal to an integer such as 15. They match because the triple equal operator (===) is implicitly used in all case/when statements. The code in the example above is equivalent to:


if (10..20) === minutes
  puts "match"
else
  puts "no match"
end
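Because each when clause calls === on its own object, a single case statement can mix regular expressions, classes and ranges. The following is only a sketch; describe is a hypothetical helper written for this illustration.

```ruby
# describe is a hypothetical helper, not part of Ruby
def describe(value)
  case value
  when /zen/  then "mentions zen"       # /zen/ === value
  when String then "some other string"  # String === value
  when 10..20 then "between 10 and 20"  # (10..20) === value
  else "no match"
  end
end

puts describe("zazen")  # Output: mentions zen
puts describe("koan")   # Output: some other string
puts describe(15)       # Output: between 10 and 20
puts describe(99)       # Output: no match
```

The clauses are tested from top to bottom, so the more specific regular expression has to come before the general String check.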

Pattern matching operators: =~ and !~


The =~ (equal-tilde) and !~ (bang-tilde) operators are used to match strings and symbols against regex patterns.

The implementation of the =~ method in the String and Symbol classes expects a regular expression (an instance of the Regexp class) as an argument.


"practice zazen" =~ /zen/   # Output: => 11
"practice zazen" =~ /discursive thought/ # Output: => nil

:zazen =~ /zen/    # Output: => 2
:zazen =~ /discursive thought/  # Output: => nil

The implementation in the Regexp class expects a string or a symbol as an argument.


/zen/ =~ "practice zazen"  # Output: => 11
/zen/ =~ "discursive thought" # Output: => nil

In all implementations, when the string or symbol matches the Regexp pattern, it returns an integer which is the position (index) of the match. If there is no match, it returns nil. Remember that, in Ruby, any integer value is "truthy" and nil is "falsy", so the =~ operator can be used in if statements and ternary operators.


puts "yes" if "zazen" =~ /zen/  # Output: yes
"zazen" =~ /zen/ ? "yes" : "no"  # Output: => "yes"

Pattern-matching operators are also useful for writing shorter if statements. Example:


if meditation_type == "zazen" || meditation_type == "shikantaza" || meditation_type == "kinhin" 
  true
end
Can be rewritten as:

if meditation_type =~ /^(zazen|shikantaza|kinhin)$/
  true
end

The !~ operator is the opposite of =~; it returns true when there is no match and false if there is a match.
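For example (variable names chosen only for this sketch):

```ruby
has_match = "zazen" !~ /zen/   # there is a match, so !~ returns false
no_match  = "zazen" !~ /koan/  # there is no match, so !~ returns true

p has_match  # Output: false
p no_match   # Output: true
```

Unlike =~, which returns an index or nil, !~ always returns true or false.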

Comparison operators


Objects such as numbers and strings, which can be compared (amongst themselves) in terms of being greater or smaller than others, provide the <=> method, also known as the spaceship method. When comparing two objects, <=> returns -1 if the first object is less than the second (a < b), 0 in case they are equal (a == b) and 1 when the first object is greater than the second (a > b).


5 <=> 8  # Output:  => -1
5 <=> 5 # Output: => 0
8 <=> 5 # Output: => 1

Most comparable or sortable object classes, such as Integer, Float, Time and String, include a mixin called Comparable, which provides the following comparison operators: < (less than), <= (less than or equal), == (equal), > (greater than), >= (greater than or equal). These methods use the spaceship operator under the hood.
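To illustrate how Comparable builds on the spaceship operator, here is a minimal sketch with a made-up Sitting class. It only defines <=> and gets the comparison operators from the mixin.

```ruby
# Sitting is a hypothetical class used only for this illustration
class Sitting
  include Comparable

  attr_reader :minutes

  def initialize(minutes)
    @minutes = minutes
  end

  # The only comparison method we define ourselves
  def <=>(other)
    minutes <=> other.minutes
  end
end

short = Sitting.new(20)
long  = Sitting.new(40)

p short < long               # Output: true
p short >= long              # Output: false
p short == Sitting.new(20)   # Output: true
p [long, short].min.minutes  # Output: 20
```

The last line shows that methods such as min, which rely on <=>, also work with the new class.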

Let's find out which classes include the Comparable mixin:


ObjectSpace.each_object(Class).select { |c| c.included_modules.include? Comparable }
Output:

=> [Complex, Rational, Time, File::Stat, Bignum, Float, Fixnum, Integer, Numeric, Symbol, String, Gem::Version, IRB::Notifier::NoMsgNotifier, IRB::Notifier::LeveledNotifier]

Comparison operators can be used in objects of all the above classes, as in the following examples.


# String
"a" < "b" # Output: => true
"a" > "b" # Output: => false

# Symbol
:a < :b  # Output: => true
:a > :b  # Output: => false

# Fixnum (subclass of Integer)
1 < 2  # Output: => true
2 >= 2  # Output: => true

# Float
1.0 < 2.0 # Output: => true
2.0 >= 2.0 # Output: => true

# Time
Time.local(2016, 5, 28) < Time.local(2016, 5, 29) # Output: => true

When comparing numbers of different classes, comparison operators will implicitly perform simple type conversions.


# Fixnum vs. Float
2 < 3.0  # Output: => true
2.0 > 3  # Output: => false