class Cascading::Node

A Node is a Cascade, Flow, or Assembly, all of which are composite structures that describe the hierarchical structure of your job. A Cascade may contain many Flows and a Flow and Assembly may contain many Assemblies (branches in the case of the Assembly). Nodes are named, contain parent and child pointers, and keep track of their children both by name and by insertion order.

Nodes must be uniquely named within the scope of their parent so that they unambiguously looked up for connecting pipes within a flow. However, we only ensure that children are uniquely named upon insertion; full uniqueness isn’t required until #find_child is called (this allows for name reuse in a few limited circumstances that was important when migrating the Etsy workload to enforce these constraints).

Attributes

child_names[RW]
children[RW]
last_child[RW]
name[RW]
parent[RW]

Public Instance Methods

add_child(node) click to toggle source

Children must be uniquely named within the scope of each Node. This ensures, for example, two assemblies are not created within the same flow with the same name, causing joins, unions, and sinks on them to be ambiguous.

# File lib/cascading/base.rb, line 32
def add_child(node)
  raise AmbiguousNodeNameException.new("Attempted to add '#{node.qualified_name}', but node named '#{node.name}' already exists") if @children[node.name]

  @children[node.name] = node
  @child_names << node.name
  @last_child = node
  node
end
desc(offset = '') click to toggle source
Alias for: describe
describe(offset = '') click to toggle source

Produces a textual description of this Node. This method is overridden by all classes inheriting Node, so it serves mainly as a template for describing a node with children.

# File lib/cascading/base.rb, line 50
def describe(offset = '')
  "#{offset}#{name}:node\n#{child_names.map{ |child| children[child].describe("#{offset}  ") }.join("\n")}"
end
Also aliased as: desc
find_child(name) click to toggle source

In order to find a child, we require it to be uniquely named within this Node and its children. This ensures, for example, branches in peer assemblies or branches and assemblies do not conflict in joins, unions, and sinks.

# File lib/cascading/base.rb, line 59
def find_child(name)
  all_children_with_name = find_all_children_with_name(name)
  qualified_names = all_children_with_name.map{ |child| child.qualified_name }
  raise AmbiguousNodeNameException.new("Ambiguous lookup of child by name '#{name}'; found '#{qualified_names.join("', '")}'") if all_children_with_name.size > 1

  all_children_with_name.first
end
qualified_name() click to toggle source

The qualified name of a node is formed from the name of all nodes in the path from the root to that node.

# File lib/cascading/base.rb, line 43
def qualified_name
  parent ? "#{parent.qualified_name}.#{name}" : name
end
root() click to toggle source

Returns the root Node, the topmost parent of the hierarchy (typically a Cascade or Flow).

# File lib/cascading/base.rb, line 69
def root
  return self unless parent
  parent.root
end

Protected Instance Methods

find_all_children_with_name(name) click to toggle source
# File lib/cascading/base.rb, line 76
def find_all_children_with_name(name)
  child_names.map do |child_name|
    children[child_name] if child_name == name
  end.compact + child_names.map do |child_name|
    children[child_name].find_all_children_with_name(name)
  end.flatten
end

Public Class Methods

new(name, parent) click to toggle source

A Node requires a name and a parent when it is constructed. Children are added later with #add_child.

# File lib/cascading/base.rb, line 20
def initialize(name, parent)
  @name = name
  @parent = parent
  @children = {}
  @child_names = []
  @last_child = nil
end