CNK's blog

Adding Library Code to Your Chef Runs

Using Mixins Inside of Chef Recipes

There are a couple of wrinkles to mixing Ruby modules into your chef code. The first appears to be that the chef DSL takes over everything (or nearly everything) - including the include method. So using normal Ruby like include MyModule inside a recipe file causes a compile time error:

..... omitted ...
resolving cookbooks for run list: ["testcookbook::default"]
Synchronizing Cookbooks:
  - testcookbook
  Compiling Cookbooks...
  [2015-07-09T13:53:52-07:00] WARN: this should log node info bar from 'Chef::Log'

================================================================================
Recipe Compile Error in .../testcookbook/recipes/default.rb
================================================================================

NoMethodError
-------------
No resource or method named `include' for `Chef::Recipe "default"'

That seems odd - but it is probably a good thing since, if it worked, it might end up including your library code in the wrong place. With nearly all libraries, you want your module code available in some specific execution scope - which is probably not the compile time scope. To use our UsefulMethods module in our recipe context, we need the following in our recipe file:

::Chef::Recipe.send(:include, UsefulMethods)
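
To make that line a bit more concrete, here is a minimal sketch. It uses a stand-in Chef::Recipe class so that it runs without Chef installed, and the stop_file_path method is a hypothetical body for UsefulMethods (the real module's contents are not shown here). Once the module is mixed in from the outside with send, every recipe object can call its methods:

```ruby
# Stand-in for Chef's recipe class so this sketch runs without Chef installed.
# In a real chef run, Chef itself defines Chef::Recipe.
class Chef
  class Recipe
  end
end

# Hypothetical library module; in a cookbook it would live under libraries/.
module UsefulMethods
  def stop_file_path
    "/tmp/stop_chef"
  end
end

# Mix the module into the recipe class from the outside.
::Chef::Recipe.send(:include, UsefulMethods)

recipe = Chef::Recipe.new
puts recipe.stop_file_path   # prints /tmp/stop_chef
```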

This blog post on the Chef.io site does a really nice job of explaining how (and why) to write libraries and then use them in your recipes. In their example code, the library code needs to be used inside the user resource: Chef::Resource::User.

Creating Modules Inside a Namespace

The second example in the custom libraries section of Customizing Ruby shows another option for how to get your library code exactly where you want it. Instead of defining a generic module and then including it in your recipe, as above, you can set up your library within the namespace in which you want to use it. In the case of our UsefulMethods code, we rewrite the library as a class inside the Chef::Recipe namespace:

class Chef::Recipe::StopFile
  # True when the stop file /tmp/stop_chef is present on this machine.
  def self.stop_file_exists?
    ::File.exist?("/tmp/stop_chef")
  end
end

And then in our recipe file we don’t have to send any message to include the new class. Because it was created inside the Chef::Recipe namespace, it gets loaded into our recipe context when the library file is loaded at the beginning of the chef run. We can just call the class method like so:

if StopFile.stop_file_exists?
   ....
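
Why does the bare StopFile constant resolve? As far as I can tell, chef evaluates the text of each recipe file against a Chef::Recipe object, and Ruby looks up constants in a string passed to instance_eval through the receiver's class. The sketch below imitates that with a stand-in Chef::Recipe class (an assumption, so it runs without Chef installed) and a one-line "recipe" held in a string:

```ruby
# Stand-in for Chef's classes so this sketch runs without Chef installed.
class Chef
  class Recipe
  end
end

# Library code defined directly inside the Chef::Recipe namespace.
class Chef::Recipe::StopFile
  def self.stop_file_exists?
    ::File.exist?("/tmp/stop_chef")
  end
end

# Imitate loading a recipe: evaluate recipe *text* against a recipe object.
# The unqualified StopFile constant is found via the Chef::Recipe class.
recipe_source = 'StopFile.stop_file_exists? ? "stopping" : "continuing"'
result = Chef::Recipe.new.instance_eval(recipe_source)
puts result
```

Note that this only works for the string form of instance_eval; passing a block would use the block's lexical scope for constant lookup instead.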

Logging in Chef

There are a couple of different techniques for logging during a chef client run. The simplest option for debugging things in any programming language is by adding print statements - or in the case of Ruby, puts statements (print with a newline added). However, in order for print statements to work, they need to be executed in a context where stdout is available AND where you, the user, can see stdout. When running chef manually (either using chef-client or via test kitchen’s ‘kitchen converge’ command), you are watching output go by on the console. So you can do things like:

puts "This is normal Ruby code inside a recipe file."

And in a client run, you will see that output - in the compile phase.

$ chef-client --once --why-run --local-mode \
              --config /Users/cnk/Code/sandbox/customizing_chef/part3_examples/solo.rb \
              --override-runlist testcookbook::default

Starting Chef Client, version 12.3.0
[2015-07-09T16:25:06-07:00] WARN: Run List override has been provided.
[2015-07-09T16:25:06-07:00] WARN: Original Run List: []
[2015-07-09T16:25:06-07:00] WARN: Overridden Run List: [recipe[testcookbook::default]]
resolving cookbooks for run list: ["testcookbook::default"]
Synchronizing Cookbooks:
  - testcookbook
  Compiling Cookbooks...
  This is normal Ruby code inside a recipe file.  ########### this is the message ##########
  Converging 0 resources

Running handlers:
  Running handlers complete
  Chef Client finished, 0/0 resources would have been updated

You can get nearly the same functionality - but with a timestamp and some terminal coloring - if you use Chef::Log in the same context:

puts "This is a puts from the top of the default recipe; node info: #{node['foo']}"
Chef::Log.warn("You can log node info #{node['foo']} from a recipe using 'Chef::Log'")

Gives:

 $ chef-client --once --why-run --local-mode \
               --config /Users/cnk/Code/sandbox/customizing_chef/part3_examples/solo.rb \
               --override-runlist testcookbook::default

 Starting Chef Client, version 12.3.0
 [2015-07-09T16:33:44-07:00] WARN: Run List override has been provided.
 [2015-07-09T16:33:44-07:00] WARN: Original Run List: []
 [2015-07-09T16:33:44-07:00] WARN: Overridden Run List: [recipe[testcookbook::default]]
 resolving cookbooks for run list: ["testcookbook::default"]
 Synchronizing Cookbooks:
   - testcookbook
   Compiling Cookbooks...
   This is a puts from the top of the default recipe; node info: bar
   [2015-07-09T16:33:44-07:00] WARN: You can log node info bar from a recipe using 'Chef::Log'
   Converging 0 resources
Running handlers:
  Running handlers complete
  Chef Client finished, 0/0 resources would have been updated

NB: the default log level for chef-client when writing messages to the terminal is warn or higher. So if you try to use Chef::Log.debug('something') you won’t see your message unless you have turned up the verbosity. This unexpected behavior caused me a bit of grief initially, as I couldn’t find my log messages anywhere. Now what I do is use Chef::Log.warn while debugging locally and then plan to take the messages out before I commit the code.
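
The level filtering is easy to see outside of chef. This sketch uses Ruby's standard Logger as a stand-in for Chef::Log (an assumption for illustration only; Chef::Log is configured by chef-client itself), with the threshold set to warn as described above. The debug message is silently dropped and only the warning is written:

```ruby
require "logger"
require "stringio"

# Capture log output in memory so we can inspect what was actually written.
output = StringIO.new
log = Logger.new(output)
log.level = Logger::WARN   # same threshold described above for the terminal

log.debug("this message is filtered out")
log.warn("this message is written")

puts output.string
```

With chef-client itself, the way to turn up the verbosity is the --log_level option (for example -l debug).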

From my experiments, just about anywhere you might use puts, you can use Chef::Log. I think the latter is better because it will probably put information into actual log files in contexts like test kitchen that write log files for examining later.

If you need something logged at converge time instead of compile time, you have 2 options: use the log resource, or wrap Chef::Log inside a ruby_block call. In either case, during the compile phase, a new resource gets created and added to the resource collection. Then during the converge phase, that resource gets executed. Creating a Chef::Log statement inside a ruby_block probably isn’t too useful on its own, though it may be useful if you have created a ruby_block for some other reason. This gist has some example code and the output: https://gist.github.com/cnk/e5fa8cafea8c2953cf91
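
The timing difference can also be sketched in plain Ruby, without chef. In this toy model the events list, the resource_collection array, and the lambdas are all hypothetical stand-ins, not Chef APIs. Code at the top level of the "recipe" runs immediately, while anything wrapped in a resource is only queued during compile and runs later, during converge:

```ruby
events = []
resource_collection = []   # toy stand-in for chef's resource collection

# --- compile phase: recipe code is evaluated top to bottom ---
events << "compile: direct log call"                               # like Chef::Log.warn in a recipe
resource_collection << -> { events << "converge: log resource" }   # like a log resource
resource_collection << -> { events << "converge: ruby_block" }     # like a ruby_block
events << "compile: end of recipe"

# --- converge phase: the queued resources run in order ---
resource_collection.each(&:call)

puts events
```

Both "compile" lines come out before either "converge" line, even though the resources were declared in between them.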

Anatomy of a Chef Run

Each chef run has 2 phases - the compile phase and the converge phase.

Compile phase

In the compile phase, the chef client loads libraries, cookbooks, and recipes. Then it takes the run list, reads the listed recipes, and builds a collection of the resources that need to be executed in this run. Ruby code within the recipe may alter what resources are added to the resource collection based on information about the node. For example, if the node’s OS family is ‘debian’, package commands need to use ‘apt’ to install packages. So if you are installing emacs, the resource collection on an ubuntu box will have an ‘apt’ resource for installing that package - but the resource collection on a RHEL box will have a ‘yum’ resource instead.
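
As a rough illustration of that last point, here is a toy version of the decision in plain Ruby. The package_resource_for helper and the hash it returns are hypothetical (chef makes this choice internally); the node is just a hash standing in for the real node object, keyed the same way as the platform_family attribute that ohai reports:

```ruby
# Toy model: choose which kind of package resource ends up in the
# resource collection, based on the node's platform family.
def package_resource_for(node, package_name)
  case node["platform_family"]
  when "debian" then { type: :apt_package, name: package_name }
  when "rhel"   then { type: :yum_package, name: package_name }
  else               { type: :package,     name: package_name }
  end
end

ubuntu_node = { "platform_family" => "debian" }
rhel_node   = { "platform_family" => "rhel" }

puts package_resource_for(ubuntu_node, "emacs")[:type]   # prints apt_package
puts package_resource_for(rhel_node, "emacs")[:type]     # prints yum_package
```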

The compile phase also has logic for creating a minimal, ordered collection of resources to run. Part of this process is deduplication. If multiple recipes include apt’s default recipe (which calls ‘apt-get update’), the compile phase adds this to the resource collection once. Any other calls to the same resource are reported in the run output as duplicates.

[2015-07-09T22:34:01+00:00] WARN: Previous bash[pip install to VE]:
  /tmp/kitchen/cookbooks/dev-django-skeleton/recipes/django_project.rb:75:in `from_file'
[2015-07-09T22:34:01+00:00] WARN: Current  bash[pip install to VE]:
  /tmp/kitchen/cookbooks/dev-django-skeleton/recipes/django_project.rb:86:in `from_file'

Converge phase

The converge phase is the phase in which the resource code actually gets run. As each resource runs, information is added to the run status object - some of which can later be written back to the chef server as the node status at the end of the run.

Run status information

The Customizing Chef book has some useful information about what chef collects in the run status object. For example, the run status object has a reference to the node object at the start of each run (basically node information from the chef server combined with the data collected by ohai). It also has a reference to the run context object:

This object contains a variety of useful data about the overall
Chef run, such as the cookbook files needed to perform the run,
the list of all resources to be applied during the run, and the
list of all notifications triggered by resources during the run.

Excerpt From: "Customizing Chef" chapter 5 by Jon Cowie

Two very useful methods are ‘all_resources’ and ‘updated_resources’. One of the examples in the book is a reporting handler that logs both of those lists to a log file (see Handler Example 2: Report Handler).
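
This is not the book's handler, just a minimal sketch of the idea. FakeResource and FakeRunStatus are stand-ins so the code runs without chef; a real handler would subclass Chef::Handler and be given a real run status object. The resource_report helper is also hypothetical:

```ruby
# Stand-ins so this sketch runs without Chef installed.
FakeResource = Struct.new(:name, :updated)
FakeRunStatus = Struct.new(:all_resources) do
  # Only the resources that changed something during the run.
  def updated_resources
    all_resources.select(&:updated)
  end
end

# Build the report text from the two resource lists.
def resource_report(run_status)
  lines = ["All resources (#{run_status.all_resources.size}):"]
  lines += run_status.all_resources.map { |r| "  #{r.name}" }
  lines << "Updated resources (#{run_status.updated_resources.size}):"
  lines += run_status.updated_resources.map { |r| "  #{r.name}" }
  lines.join("\n")
end

status = FakeRunStatus.new([
  FakeResource.new("package[emacs]", false),
  FakeResource.new("template[/etc/motd]", true),
])
report = resource_report(status)
puts report
```

In a real handler the same text would be written to a log file rather than printed.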

Testing in Django

Test Runner

First the good part: Django, by default, uses Python’s built in unittest library - and as of Python 2.7 that has a reasonable set of built in assertion types. (And for versions of django before 1.8, django backported the python 2.7 unittest library.) Django has a pretty good test discovery system (apparently from the upgraded Python 2.7 unittest library) and will run any code in files matching test*.py. So to run all your tests, you just type ./manage.py test at the top of your project. You can also run the tests in individual modules or classes by doing something like: ./manage.py test animals.tests - without having to put the if __name__ == '__main__' stuff in each file. You can even run individual tests - though you have to know the method name (more or less like you have to in Ruby’s minitest framework): ./manage.py test animals.tests.AnimalTestCase.test_animals_can_speak

The best part of Django’s test runner is the setup it does for you - creating a new test database for each run, running your migrations, and if you ask it to, importing any fixtures you request. Then, after collecting all the discovered tests, it runs each test inside a transaction to provide isolation.

Test Output

But coming from the Ruby and Ruby on Rails worlds, the testing tools in Python and Django are not as elegant as I am used to. At one point I thought the Ruby community’s emphasis on creating testing tools that display English-like output for the running tests bordered on obsessive. But having spent some time in Python/Django, which doesn’t encourage tools like that, I have come to appreciate the Rubyists’ efforts. Both RSpec and Minitest have multiple built in output format options - and lots of people have created their own reporter add-ons - so you can see your test output exactly the way you want it with very little effort. The django test command allows 4 verbosity levels but for the most part they only change how much detail you get about the setup process for the tests. The only differences in the test output reporting are that you get dots at verbosity levels 0 and 1 and the test names and file locations at levels 2 and 3:

$ python ./manage.py test -v 2

..... setup info omitted .......

test_debugging (accounts.tests.test_models.UserModelTests) ... ok
test_login_link_available_when_not_logged_in (accounts.tests.test_templates.LoginLinkTests) ... ok
test_logout_link_available_when_logged_in (accounts.tests.test_templates.LoginLinkTests) ... ok
test_signup_link_available_when_not_logged_in (accounts.tests.test_templates.LoginLinkTests) ... ok
test_user_account_link_available_when_logged_in (accounts.tests.test_templates.LoginLinkTests) ... ok
test_profile_url (accounts.tests.test_urls.AccuntsURLTests) ... ok
test_signup_url (accounts.tests.test_urls.AccuntsURLTests) ... ok
test_url_for_home_page (mk_web_core.tests.GeneralTests) ... ok

----------------------------------------------------------------------
    Ran 8 tests in 0.069s

So increasing the verbosity level is useful for debugging your tests but disappointing if you are trying to use the tests to document your intentions.

Behave and django-behave

This is the main reason why, despite being unenthusiastic about Cucumber in Ruby, I am supporting using Python’s behave with django-behave for our new project. One of the things I don’t like about cucumber is that it all too frequently becomes an exercise in writing regular expressions (for the step matching). I also don’t like that if you need to pass state between steps, you set instance variables; this is effective, but it looks kind of like magic.

With ‘behave’, you need to do the same things but in more explicit ways. The step matching involves literal text with placeholders. If you want to do full regular expression matching you can, but you need to set the step matcher for that step to be ‘re’ - regular expression matching isn’t the default. For sharing state, there is a global context variable. When you are running features and scenarios, additional namespaces get added to the root context object - and then removed again as they go out of scope. Adding information to the context variable seems more explicit - but with the namespace adding and removing - I am not sure that this isn’t more magical than the instance variables in Cucumber.

Django’s TestCase Encourages Integration Tests

The main testing tool that Django encourages using is its TestCase class, which tests a bunch of concerns - request options, routing, the view’s context dictionary, response status and template rendering.

It’s odd to me that Django’s docs only discuss integration tests and not proper unit tests. With Django’s Pythonic explicitness, it is fairly easy to set up isolated pieces of the Django stack by just importing the pieces you care about into your test. For example, you can test your template’s rendering by creating a dictionary for the context information and then rendering the template with that context. Harry Percival’s book “Test-Driven Development with Python” does a very nice job of showing you how to unit test the various sections of the Django stack - routing, views, templates, etc.

More than just not discussing isolated unit tests, at least some of Django’s built in assertions actively require you to write a functional / integration test. I tried rendering my template to html and then called assertContains to test some specific html info. But I got an error about the status code! In order to use assertContains on the template, I have to make a view request.

Coming from the Rails world, I don’t really want the simple assertContains matching. What I really want is Rails’ built-in html testing method, assert_select. I found a Django library that is somewhat similar, django-with-asserts. But like assertContains, django-with-asserts’ test mixin class uses the Django TestCase as its base and so also wants you to make a view request so it can test the StatusCode. I really wanted django-with-asserts functionality but I want to use it in isolation when I can, so I forked it and removed the dependency on the request / response cycle.

A Send-Only Email Server

Our ZenPhoto install wants to be able to notify us when there are new comments. I also may eventually want to set up exception notifications for some of my dynamic sites. At least for now, I don’t want to run a full-blown mail server for our domains; I don’t want to deal with spam detection and restricting who can use the mail server to relay mail, etc. But I know that many of the common Unix email servers can be configured so that they don’t receive mail and only send mail if it originates on one or more specific servers. I don’t have a lot of experience setting up mail servers. The ones I am most familiar with are qmail (which is what ArsDigita used everywhere) and Postfix. I am betting that it will be easier to set up Postfix on Ubuntu so let’s look for some instructions.

Installing Postfix

There are some promising looking instructions on the Digital Ocean site - for Postfix on Ubuntu 14.04. Postfix is apparently the default mail server for Ubuntu because sudo apt-get install mailutils installs postfix as one of the “additional packages”. The install process asked me two questions: what kind of mail server configuration I needed (I chose ‘Internet Site’), and what is the domain name for the mail server. I debated whether I should leave this set to the hostname for the server, which is a subdomain of one of our domains, or if I should set it to just the domain. Tim may have our domain name registrar set up for email forwarding for the domain so it may be slightly safer to configure this mail server with the subdomain. And it will make it a lot clearer to me where the email is coming from.

$ sudo apt-get install mailutils
...
... Lots of install info....
...
Setting up postfix (2.11.0-1ubuntu1) ...
Adding group `postfix' (GID 114) ...
Done.
Adding system user `postfix' (UID 106) ...
Adding new user `postfix' (UID 106) with group `postfix' ...
Not creating home directory `/var/spool/postfix'.
Creating /etc/postfix/dynamicmaps.cf
Adding tcp map entry to /etc/postfix/dynamicmaps.cf
Adding sqlite map entry to /etc/postfix/dynamicmaps.cf
Adding group `postdrop' (GID 115) ...
Done.
setting myhostname: trickster.ictinike.org
setting alias maps
setting alias database
changing /etc/mailname to trickster.ictinike.org
setting myorigin
setting destinations: trickster.ictinike.org, localhost.ictinike.org, , localhost
setting relayhost:
setting mynetworks: 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128
setting mailbox_size_limit: 0
setting recipient_delimiter: +
setting inet_interfaces: all
setting inet_protocols: all
/etc/aliases does not exist, creating it.
WARNING: /etc/aliases exists, but does not have a root alias.

Postfix is now set up with a default configuration.  If you need to
make changes, edit /etc/postfix/main.cf (and others) as needed.
To view Postfix configuration values, see postconf(1).

After modifying main.cf, be sure to run '/etc/init.d/postfix reload'.

Running newaliases
 * Stopping Postfix Mail Transport Agent postfix
    ...done.
 * Starting Postfix Mail Transport Agent postfix
    ...done.
Processing triggers for ufw (0.34~rc-0ubuntu2) ...
Processing triggers for ureadahead (0.100.0-16) ...
Setting up mailutils (1:2.99.98-1.1) ...
update-alternatives: using /usr/bin/frm.mailutils to provide /usr/bin/frm (frm) in auto mode
update-alternatives: using /usr/bin/from.mailutils to provide /usr/bin/from (from) in auto mode
update-alternatives: using /usr/bin/messages.mailutils to provide /usr/bin/messages (messages) in auto mode
update-alternatives: using /usr/bin/movemail.mailutils to provide /usr/bin/movemail (movemail) in auto mode
update-alternatives: using /usr/bin/readmsg.mailutils to provide /usr/bin/readmsg (readmsg) in auto mode
update-alternatives: using /usr/bin/dotlock.mailutils to provide /usr/bin/dotlock (dotlock) in auto mode
update-alternatives: using /usr/bin/mail.mailutils to provide /usr/bin/mailx (mailx) in auto mode
Processing triggers for libc-bin (2.19-0ubuntu6.6) ...

Configuring Postfix to only accept mail from localhost

The installer had set up Postfix to listen on all available interfaces. So netstat -ltpn shows

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:3306          0.0.0.0:*               LISTEN      2028/mysqld
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      11341/sshd
tcp        0      0 0.0.0.0:25              0.0.0.0:*               LISTEN      15201/master
tcp6       0      0 :::80                   :::*                    LISTEN      2176/apache2
tcp6       0      0 :::22                   :::*                    LISTEN      11341/sshd
tcp6       0      0 :::25                   :::*                    LISTEN      15201/master

So, following the instructions, I edited /etc/postfix/main.cf and changed inet_interfaces = all to inet_interfaces = localhost and restarted the postfix service. Now I see postfix only on the local interface (ipv4 and ipv6):

tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      15405/master
tcp6       0      0 ::1:25                  :::*                    LISTEN      15405/master

I tested email sending with: echo "test email body" | mail -s "Test email" cnk@<destination> and it went through just fine. YEAH!

Now, I need to forward system mail (e.g. root mail) to me. To do this, I need to add a line to /etc/aliases for root + the destination emails. Then I got the new entries in /etc/aliases into /etc/aliases.db by running the newaliases command. I tested that the new root alias works by sending a second test email: echo "test email body" | mail -s "Test email for root" root. And this one also got to me.

There was an additional section about how to protect my domain from being used for spam - especially in this case, being impersonated. The article on setting up an SPF record doesn’t look too hard - if the service we are using to do DNS lets us set that up. I’ll have to look into it when we are switching DNS.

Configuring Email in ZenPhoto

Having the ability to get root mail is good - but the main reason I wanted email on this server was for ZenPhoto’s comment functionality. So now, on the plugin page of the ZenPhoto admin site, there is a Mail tab with two options. For now I chose zenphoto_sendmail which just uses the PHP mail facility to send mail using the local mail server.