Complete AWS site using Ansible

I started using Ansible a couple of years ago and a lot has changed since then. To refresh my knowledge, I decided to create a demonstration site (called “acme”) using Ansible, taking advantage of its newer modules and features. To make this worthwhile, I set a few goals for myself:

  • Create a complete site, including network topology, firewall rules, certificates, instance creation and configuration, plus the coordination between dependent resources, like web server and database. The rule was simple: no human intervention.
  • Multiple Environments. Any real site is going to need at least a dev and prod environment and you often find many more. My goal was to manage this from within a single repository, all under source control.
  • Security. Keep the secrets separate from the main configuration, but keep them under source control, not in a parallel system just to manage secrets.

So, enough talk, where is the code?

Check it out and let me know what you think!

Jenkins Build Cluster

Well, it has been another couple of years, but I’m still here!

I’m working at Pertino now, doing DevOps and mostly focused on builds. As such, running Jenkins and optimizing the use of build slaves is part of my job.

A colleague just wrote a blog post about setting up a build server that will automatically create just-in-time build slaves. It’s just part 1 right now, but if you are interested, it’s definitely worth taking a look at. The very cool thing is that it’s all done over a secure network connection using our product, Pertino.

Check out the blog: http://pertino.com/blog/secure-cloud-bursting-part-1/

picoprojector.org – Take 2

It has been a long time since I posted; over a year. My last post was about acquiring the site picoprojector.org, so I’ll start there.

My goal with the pico projector site was to learn about Google AdWords, Google Analytics and online advertising revenue. It was a great success, in that I did learn quite a bit. If you cause someone to click on an ad on your site and they buy something, you can earn a lot of money!

Some quick lessons: if you have a site with a Google PageRank of anything > 0, be careful. If you change the URLs, for example, you will start getting 404s and your PageRank will drop instantly. To get it back will take a long time (months, maybe).

Another is that I finally understand the idea of selling something before you have a product. Once I found out how to analyze keywords from Google, I was able to analyze the keywords for the product I was building. It came as quite a shock, but nobody on the Internet seemed interested. If nobody is searching for your “thing”, then no matter how good your product might be, there is just no way to get the word out.

It was also nice to confirm things I already knew: Google is life. Bing? The others? Rounding errors. I’m not talking about quality or my opinions or anything, just the number of hits you get and where they come from. If you do something to make Google unhappy, you might as well not exist.

So, I did that for a while, learned a lot, but ultimately it isn’t something I could continue. I tried to sell the site and got some nice offers, but it’s complicated to transfer a site and I just didn’t have the time. In the end, I just dropped the domain name(s) and hosting accounts. I noticed someone else already has a site there, which is not surprising at all.

I had been working on my startup for a long time, but finally decided to get a “real” job. It was as a manager of the group that does builds and other technical services for an Engineering team. The company, Plastic Logic, was pretty cool and I met a lot of fantastic people. Unfortunately, the company decided to license the technology it had instead of making products, so they closed down all operations in the US.

I have so many projects going that it is hard to stop working on them–and I haven’t! I’ll go into more detail later, if I can start to post on a more regular basis.


picoprojector.org

I’m the proud new owner of the website picoprojector.org!

I found it for sale on a site called flippa.com. Since I’m into technology and was also interested in learning more about WordPress and how Google AdSense works, this is a perfect fit for me.

I’m excited to add to the site by reviewing “pico” projectors and writing about them. Check it out!

Development -> Production

I’ve been working very hard to get my site ready for release. The first version isn’t much, just a self-hosted version of the static site I have on SquareSpace right now, but with a working “contact” button. So, it’s been a while–why isn’t it done?

It seems that I’ve had many ideas on how to manage development, testing and production code and much of my code has some concept of this. But, not all. And, getting from most to all and getting it consistent is taking much longer than I anticipated.

My code is all on GitHub and I’ve really fallen in love with using “git push” and “git pull” on all my various machines (servers, laptop, home computer). I’m always in sync, have multiple backups and it is very quick. But, I’ve also been keeping all the site configuration data in the same place. So, now, if I make some setting changes for production, they appear everywhere. Not good!

I’ve figured out ways around most of this. I’m still keeping the configuration information in the repository, but I differentiate based on hostname. There really aren’t that many configuration files, so I have a mapping file that says where they all go, then I (am about to) write a script that goes through and puts links in the right places back to these files, something like the sketch below. This way, I have a 100% up-to-date picture of my configuration on every machine stored in the repository.
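Roughly what I have in mind (the repository path and mapping-file location here are made up, not the real ones):

#!/usr/bin/perl
# Read a per-host mapping file of "repo-file  target-path" pairs and
# symlink each target back to the copy stored in the repository.
use strict;
use warnings;
use Sys::Hostname;

my $repo = "$ENV{HOME}/site";                      # hypothetical checkout path
my $map  = "$repo/config/" . hostname() . ".map";  # hypothetical mapping file

open my $fh, '<', $map or die "No mapping file for this host: $!";
while (my $line = <$fh>) {
    next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
    my ($repo_file, $target) = split ' ', $line;
    unlink $target if -l $target;      # replace a stale link
    symlink "$repo/$repo_file", $target
        or warn "Could not link $target: $!\n";
}
close $fh;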

The next piece I’ve tested out, but not used for real yet, is Amazon’s CloudFront. I create my bucket, turn on CloudFront, make a CNAME so the URL looks nice, inject this as a “prefix” to all of the image/css/js paths in my code and I’m done. Well, not quite.

First, I didn’t set the permissions explicitly, so everything was private. Then, I forgot the MIME type, so everything was served as binary. Finally, I noticed that I’d missed adding the prefix in a few places. Nothing huge or that I haven’t dealt with before, but it is hard to test because only uniquely-named files propagate quickly. I guess I’ll have to wait until tomorrow to try again!
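For the record, setting the permissions and content type explicitly at upload time fixes the first two problems. A rough sketch using Net::Amazon::S3 (the bucket name and file paths are invented for the example):

use strict;
use warnings;
use Net::Amazon::S3;

my $s3 = Net::Amazon::S3->new({
    aws_access_key_id     => $ENV{AWS_ACCESS_KEY_ID},
    aws_secret_access_key => $ENV{AWS_SECRET_ACCESS_KEY},
});

my $bucket = $s3->bucket('my-site-static');   # hypothetical bucket name

# Set the MIME type and make the object public-read so CloudFront can
# serve it; left to the defaults, both come out wrong for a static asset.
$bucket->add_key_filename(
    'css/site.css', 'build/css/site.css',
    {
        content_type => 'text/css',
        acl_short    => 'public-read',
    }
) or die $s3->err . ': ' . $s3->errstr;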

oDesk

I’ve been using oDesk quite a bit recently and am having great luck. I’ve hired a few people and am able to get them productive right away. And, if I can’t, I can simply end the contract.

Having the ability to limit hours makes the decision much less of a risk. And, having oDesk handle all the billing, etc. is great.

Being able to look at screenshots from people doing work for you is a little creepy. But, still interesting! And, I guess the people working are fully aware of this and can even label the screenshots. It actually helped me a couple of times, because I can see what kind of system they are using and don’t have to waste time giving multiple instructions (Windows? Linux? Mac?).

The tests seem to be a pretty good indicator. From what I’ve heard/read, they are tough. And, my experience has tracked pretty well against how well they do on the tests. Obviously, this wouldn’t be too hard to cheat on, and so will never be iron-clad. But, it’s nice to have *something* to go on.

The web interface is not great. It looks nice, but there are some usability challenges. The separation of buyers and providers needs to be greater, I think. Also, it takes way too many screens for me to figure out what everyone is doing. Maybe this would improve if I was doing multiple projects at once or had more people; I can’t tell yet.

Did oDesk pay me to write this? Do they give me a discount? No. If they did, would I take it? Yes. I’m trying to bootstrap a startup, after all. To the FTC and everyone involved in trying to marginalize bloggers with laws to “protect” the public: (well, I probably shouldn’t use words like that; they might be illegal too).

Cookies

I just started looking at using cookies with my site. Since the entire API is REST-based, there are (and will be) no cookies. For the web tool, though, keeping state and sessions will be very important.

Since I like clean URLs, I can’t send session IDs as query variables. And, I really don’t want to always use POST, although I’m not sure there is any technical reason why this wouldn’t work.

So, I’m working out the details of how to use cookies for login, sessions and preferences. I found an old post talking about cookie best practices. So far, I’m liking this advice and modeling my system after it.

So, I’ll have a login cookie that will be used if you check “Remember Me.” I will also have a session cookie to use while you are actively using the site. And, a preferences cookie that will remember specifics about the computer/browser you are using (as opposed to your account). I figure the preferences might come in handy if you use a Mac and a PC, or use one system for presentations or some other different use case. We’ll see if that actually makes sense in practice.
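Roughly, setting all three with CGI.pm might look like this (the cookie names, lifetimes and the login sub-domain are placeholder guesses, not a finished design):

use strict;
use warnings;
use CGI;

my $q = CGI->new;

my $login_token = 'placeholder';   # would come from the login system
my $session_id  = 'placeholder';   # would come from session storage

# Long-lived login cookie, only set when "Remember Me" is checked.
# Scoping it to a login sub-domain keeps it off most requests.
my $login = $q->cookie(
    -name    => 'login',
    -value   => $login_token,
    -domain  => 'login.example.com',
    -expires => '+30d',
    -secure  => 1,
);

# Short-lived session cookie for active use of the site.
my $session = $q->cookie(
    -name    => 'session',
    -value   => $session_id,
    -expires => '+1h',
);

# Per-browser preferences, tied to the machine rather than the account.
my $prefs = $q->cookie(
    -name    => 'prefs',
    -value   => { layout => 'wide', resolution => '1024x768' },
    -expires => '+1y',
);

print $q->header( -cookie => [ $login, $session, $prefs ] );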

I’ve always been worried about performance, since cookies are sent with every request. But, I figure I might use a login sub-domain to keep that one under control. And, I *need* the session cookie, so I’ll just make it short.

I’ll post more as I learn more.

The BackBlaze Pod

I love doing “fantasy” configurations for computers. Sometimes for home, but usually for work. High-performance servers, HPC systems, storage solutions; you name it.

So, it was really fun to read about the BackBlaze Pod. They went all out: a custom case, super-high-density and super-low cost.

They paid really, really close attention to detail:

  • The boot drive is a $38 80GB Parallel ATA drive. Where do you even find a drive this small?
  • Dual-core CPU. With quad-core so cheap, it is hard not to just spend the extra few dollars and double your processing power.
  • Hard drives hanging off of a SATA card connected to PCI, not PCIe (they have those, too).
  • Consumer-grade power supplies, motherboard and hard drives.

Here is a collection of articles, starting with the original blog post:

  • BackBlaze Pod
  • Hacker News discussion of the original BackBlaze article
  • Sun Engineer Comments
  • Hacker News discussion of the Sun Engineer’s comments
  • StorageMojo’s take – with comments from BackBlaze

It’s hard to focus on work when I could be building one of my own!

SQL, JSON and embedded JSON

No answers today, only questions.

My application is using JSON objects to store data. Initially I didn’t like them because they felt kind of sloppy; you didn’t know if the things inside were defined, were arrays, hashes, functions or what. Now, I really like this flexibility (and, yes, I know JS doesn’t have a hash).

So, how do I store this complex object in my database? So far, it hasn’t been much of an issue. The object is a list, the list has values in JSON and so the object is a table in SQL and the values are records. No big deal.

But, now I have more complex objects. They are essentially tables within tables within tables. So, the translation code can get extensive. And, I have way too many joins.

One thing I tried is to Base64-encode the JSON object and just stick it in the database. This actually works pretty well, except that I can’t get to the data with a simple SQL query. I first have to get the blob, decode it, convert it into something usable and then do processing on that. Conceptually, not so hard, but a huge pain when much of your data is hidden away like this.
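For the curious, the round trip looks something like this (the table and schema are invented for the example):

use strict;
use warnings;
use DBI;
use JSON;          # encode_json / decode_json
use MIME::Base64;  # encode_base64 / decode_base64

my $dbh = DBI->connect('dbi:SQLite:dbname=site.db', '', '',
                       { RaiseError => 1 });
$dbh->do('CREATE TABLE IF NOT EXISTS objects (id INTEGER PRIMARY KEY, data TEXT)');

# Store: serialize the nested structure and stash it in one column.
my $object = { name => 'widget', parts => [ { id => 1 }, { id => 2 } ] };
my $blob   = encode_base64( encode_json($object), '' );
$dbh->do('INSERT INTO objects (id, data) VALUES (?, ?)', undef, 42, $blob);

# Fetch: the reverse trip. SQL can't see inside $blob, which is
# exactly the problem described above.
my ($raw) = $dbh->selectrow_array(
    'SELECT data FROM objects WHERE id = ?', undef, 42);
my $restored = decode_json( decode_base64($raw) );
print $restored->{parts}[0]{id}, "\n";   # prints 1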

My answer for now is to simplify my application so I don’t have this problem. But, it isn’t gone, just deferred. If anyone has advice, I would love to hear it.

mod_perl

I’ve been using mod_perl for quite some time, but never to its full potential. It is difficult for me to dig through all the Apache 1.3, CGI and mod_perl 1.0 crap to learn what mod_perl 2.0 actually does.

Well, it was worth it! I knew things would be faster, but wow! It is so much faster. I can now do a curl-based web request faster than I can run a simple local Perl script.

There were a few bumps and I still have a few things to transition. I’ve always been very careful with global variables and other typical gotchas and, fortunately, that work seems to have paid off.

One of the biggest issues I had was that the “postdata” would disappear. I did some investigation and found that this only happened with a POST request. If I did the exact same request using PUT, the “postdata” came through just fine.

I found posts referencing this problem going back to 2003, but unfortunately, none of the solutions worked for me.

It turned out that I have a line that says:
$my_params = Vars($query_string);

After that, the postdata is gone!

So, all I had to do was capture the postdata first. I probably shouldn’t be using Vars anyway now, but I’m only solving one issue at a time!
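In case anyone else hits this, the workaround looks roughly like this; it relies on CGI.pm stashing a non-form request body in the POSTDATA parameter, so treat it as a sketch rather than gospel:

use strict;
use warnings;
use CGI;

my $query = CGI->new;

# Grab the raw request body *before* anything else touches it. When the
# content type isn't form-encoded, CGI.pm exposes it as POSTDATA.
my $postdata = $query->param('POSTDATA');

# Now it's safe to flatten the parameters; $postdata is already saved.
my %params = $query->Vars;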