WHERE WE’RE GOING, WE DON’T NEED ROADS… BUT WE WILL NEED FLAGS

In the previous blog post I introduced our stripe-ctf-2-vm, a self-contained capture the flag puzzle ladder in one vm. In this post, I’d like to talk about how we used the vm to introduce the security mindset to our developers here at Bazaarvoice.

One of the tenets in R&D is to responsibly “fail fast, win fast” with data-driven decisions. It was time for a grand experiment. We put on our lab coats and called for a dozen volunteers for a test training session. The format of this session ran like this:

  • Each participant was a lone competitor.
  • Discussion was encouraged to promote collaboration.
  • Each puzzle was timeboxed based on the feel of the room. 
  • Each puzzle/timebox would conclude with a discussion of the solution.  

sshot-104

What We Learned, Part 1

Gamifying learning is hard. There’s a whole field dedicated to this for a reason. We felt like we hit the broad strokes by providing: an increasingly difficult ladder of challenges, competition, social interaction (almost, see below), and timely feedback. The experiment led us to some very important tweaks…

  • Lone competitors don’t collaborate effectively, for obvious reasons. Attendees should form small teams of three or four for a better social element. Consider mixing technical skill sets and/or functional teams. That really helps.
  • There’s a balance to be had between freedom to play with a puzzle and frustration. It’s a good idea to track the time closely and cut to the solution before tables are flipped.
  • You also need to track the time so you can make an educated guess on how long future training sessions should be.
  • Proctoring and encouragement is needed for those new to CTF or security puzzles. This is easier to do when small teams are formed.
  • Prizes rewarded per level don’t motivate very much.

Taking It to the Big Leagues

Armed with what we learned and bolstered by testimonials from some of the attendees, we proposed to management a way to take this further. Now, BV R&D management is very pro-learning and our VP takes security very seriously. We didn’t have to sell the Why very much, just the How. Since this education was seen as vital to the growth of our engineering effort, we made the case that all one hundred and twenty plus of our engineers needed to train. We provided a plan for about nine sessions with about twenty engineers each. Once we had management buy-in, we did a two minute elevator-pitch to all of engineering so they had context for the meeting invites that followed.

If you don’t have rapport with your management about the value of secure code, you have your work cut out for you. We suggest you leverage your own talent to locate a few security flaws either through code review, pen testing, or fuzzing tools if you’ve got the skills. The likely low-hanging fruit for a web app exploit are unsanitized inputs, cross-site scripting, and SQL injection. Put together a presentation that educates on the rising number of web app breaches, the cost of fixing flaws in production, and the fact that you just found flaws in production. Paying for external pen testing or security training, or security tools can be expensive. Use what leverage you get from the presentation to instigate a bottom-up approach like a book club or training like this CTF session to cultivate security mavens who can advise the engineers around them.

tumblr_l9ofytzjp61qb7unno1_500

The Pattern for a Session

Once you have roomful of engineers ready get their security on, what do you do? We divided them up into small teams and tried to get a mix between front-end, back-end, and QA skill sets when possible. The next thirty or so minutes were spent on helping the participants setup their VMs and log into them via SSH. Most of the issues you will run into will likely stem from mix-ups with localhost VBox adapters or trying to run and log into the VM directly from the VBox UI rather than via their terminal and shell script. When your engineers are logged into the VM, you’re ready to run through the introductory slides.

As soon as fingers are poised to take on the puzzles, give everyone the password and present the staging slide for Puzzle 0 (0, because we are programmers). Set up the puzzle and let the teams go to town on it. It’s a good idea to remind everyone to be vocal about looting the puzzle. And make sure they know they can look at the source code in the VM or in the github repo (no, the code you can see in the levels/0 directory is not the running instance of it so don’t modify it and expect to see the changes). A few minutes later, show the hint slide. Feel out your audience though. Easier puzzles may not need a hint, harder puzzles may need more breathing room before or after the hint is shown.

After you have a winner, or better yet, wait until you have two victorious teams (you don’t want to move too fast), show the solution and remediation slides and talk through ways the vulnerability could have been avoided. Any examples specific to your stack or technology you can use to bolster the remediation discussion will strengthen the discussion enormously. Once all the teams have unlocked the next level, repeat until you run out of time or until you solve all the puzzles. The latter means you are a bunch of badasses. Srsly!

maxresdefault

Go as far as you can, and reward the team that conquered the most puzzles with something highly coveted: Pokemon figures. Or money. Whichever.

Your mileage will very of course, but we found the following timing worked for BV R&D across our three hour training sessions:

ctf-timings

What We Learned, Part 2

Here’s what we learned from conducting nine sessions:

  • Scheduling 120+ people for a three hour training session is hard. Set aside admin time to handle rescheduling requests and plan for a make-up session.
  • Generate a VM per session so the loot is different each time.
  • For our engineers, a three hour session allowed the class to work through the first 5 or 6 puzzles as a group which made for a good introduction to security vulnerabilities.
  • Send out instructions and links to your generated VMs in the meeting invite.
  • Even so, plan on spending the first thirty to forty minutes setting up VMs.
  • Don’t send out the ctf user password in the invite; some clever, motivated individual will work through all the problems before the session. 🙂  If they do that, they should proctor the session with you.
  • If your company uses HipChat, Slack, or similar, set up an invite-only room per session for that session’s attendees only. It helps to have a means of cut/pasting the solutions to the group.
  • Don’t forget to work the room to give encouragement and redirect rabbit holing. Each team is different and some prefer to be left alone. Learn to feel that and respect it.
  • Some engineers will get hung up on the implementation details of the puzzles: “I don’t need to learn bloody PHP.” Try to impress upon them the larger vulnerability exercised by the puzzle. Try leading them through the thought exercise of how the security flaw (and not the language-specific flaw) may be at work in your company’s code today.
  • Some teams will race ahead after unlocking the next level. Most of the time that will even out as the CTF continues. Occasionally you have an ace security-minded team. If they continually blaze ahead, you may need to get their buy-in to take them out of direct competition with the other teams for morale’s sake. Perhaps they can help proctor the others?
  • Use something simple like a GoogleDoc form to take a quick poll of participants after each session. You will get invaluable feedback on how to better customize this to your organization.

We hope you find this unique way of introducing the secure coding mindset to your engineering organization useful and fun. It is just the tip of the iceberg. Once your engineers are thinking this way, you will need to develop further plans to encourage more education on the common secure coding patterns themselves. Consider periodic secure coding code reviews or pair programming with the experienced security-minded folks on your team. Forming teams to compete in other CTF competitions and internal security bug bounties might also help. Please don’t hesitate to contact us if you want tips on how to implement this for yourself or if you just want to let us know how it worked for you.

Ah, But Do You Have a Flag?

Hey you there, did you know that forty percent of all data breaches are due to web application vulnerabilities? That means the very software your team is building is likely to be the vector to getting your data pwnd. Still feeling skeptical? You should google Heartland’s 2008 breach, eBay’s XSS vulnerability, or Time Warner’s password leak. I’ll wait.

Done? Pretty scary, isn’t it? 

Great, But How Do You Get Your Developers Thinking About Security?

The discipline of bullet-proofing your code against application vulnerabilities is called Secure Coding. You want this fancy secure coding to up your AppSec game, but what if your R&D organization lacks the skills? You hired smart people, they can learn it, but they need to want to.  They need to feel it. So how do you get your team stoked on security awareness (besides telling them to stop writing their passwords on post-it notes)?

The first thing you do is put together a rip-roaring slide deck with the top ten security flaws and a snazzy background and get them to read the heck out of it.  Developers love slide decks.

Hmm, That didn’t work.

If only there were a better way, more engaging way.  And there is. Did you learn to code just by reading about Java? No way. You started working on coding examples to get the hang of it, right?  Maybe you even got your code katas or koans on so you could motivate yourself. Why not do the same to cultivate some security awareness love?

0p7ejqm

AppSec enthusiasts commonly compete in Capture The Flag contests. No, not this. Not this either. There are a couple of CTF formats out there, but the Jeopardy format is the one that best suits the needs of introductory training. This format is made up of a ladder of increasingly difficult puzzles. The ladder works like this:

  • Look at the puzzle, in this case a flawed web application. Since we’re interested in secure coding, look at the source code the app.
  • Throw what you can against it.
  • If you succeed in exploiting a flaw in the app, you should get it to cough up the key to the next level. That’s called the loot.
  • Use said loot to unlock the next level.
  • Lather. Rinse. Repeat until you are the first one to loot the final level.
  • Stand on table and celebrate your victory.

You might be thinking, “This is great advice but how do I get me one of those CTF contests?” We thought the same thing. We didn’t have time to wait for someone else to put together a competition, and we wanted to make inroads on secure coding training in a more controlled environment. What to do?

What We Did

Some of us had competed in the Stripe 2.0 CTF like, 37.5 computer years ago (that’s roughly 4 years ago in people years). Fortunately, the good people at Stripe open-sourced those very same web app puzzles. Yea! But they had languished untouched in the backwaters of github. Boo!

We needed…

moar

After some studious digital archeology in the form of ancient version management, we resuscitated the puzzles. Once we had the puzzles in hand, we used veewee to roll a VirtualBox (VBox) compatible VM with some scripting magic to auto-generate the loot values. In this VM, each puzzle was set up to run sandboxed away from the casual user, but still gave them access to the source code.

How Can You Do the Same?

yougetavm

If you’ve read this far, you might be ready to introduce some CTF-based training to your organization. You might still be thinking, “This is great and all, but how do I get me one of those CTFs?”  Scroll no further, true believer.  We have open-sourced all the material you need to conduct your own CTF training right here in this very github:

https://github.com/bazaarvoice/stripe-ctf-2-vm

The instructions to roll the VMs can be found here.  The slide deck needed for the training sessions can be found here. It’s like today is your birthday!

In the next post, I will explain how we developed an introductory secure coding training session around this vm and provide advice on how you can do it too. Now, go get that flag!

Three Takeaways from CSSConf 2016

This year Bazaarvoice sponsored CSSConf 2016 in beautiful Boston, MA, USA and I was able to attend!
Here are my three top takeaways from CSSConf 2016:

Flexy Flexy Flexbox

A little over a year ago, our application team wasn’t sure how “stable” Flexbox or its spec were: there was already an old syntax, a new syntax, and a weird IE10 “tweener” syntax.

The layout advantages Flexbox brought were strong enough (*cough* vertical centering) that we decided to move forward with it and prefix all the things. Now browser support is so good that if you can drop IE8 and work around some known IE11 bugs, there is no reason not to use Flexbox in your designs right now.

A great reference I keep going back to for Flexbox is this css-tricks guide. Here are some other tips and tricks from the conference:

  • Flexbox is now available in Bootstrap 4
  • Use CSS Grid (when it becomes available) for major page layout, and Flexbox for UI elements
  • For mobile / small screens: add a media query and set the flex-direciton to column to stack your cells instead
  • Do as much as you can on the container to keep your code DRY (Don’t Repeat Yourself)
  • We can finally get rid of that pesky col-2, col-8, col-crapIAddedWrong grid system!

For more information, I recommend CSS4 Grid: True Layout Finally Arrives by Jen Kramer and It’s Time To Ditch The Grid System by Emily Hayman.

Stop Thinking In Pixels

The talk Stop Thinking In Pixels by Keith Grant was particularly enlightening.
The basic premise was not to micromanage your CSS:

Without fully understanding what CSS is doing for us, we try to push through it to control exactly what is going on in the browser.

Driving this point home, Keith recommended to stop thinking in pixels because

The pixels don’t matter. Let the browser do it.

You should instead be thinking in terms of the em and rem. Tools that simply convert px to em aren’t the answer either — you’re still thinking in terms of pixels and therefore missing out on some important benefits. Instead, learn from something like type scale and approach measurements with a fresh perspective.

I recommend watching the talk in full, but a quick cheatsheet follows:

Property Recommended Unit
font-size rem
padding, margin, border-radius, etc. em
border-width px

When in doubt, use em.

To summarize,

Ems are the most powerful when you fully embrace them.

Apps vs Documents

In this day and age we are all used to thinking in terms of “apps”. But the trinity of HTML, CSS, and JS was not conceived in this day and age. Two great quotes I wrote down from Component-Based Style Reuse by Pete Hunt are

CSS is great for documents, maybe not 2016 Apps

and

If you sat down and created styling in 2016, you would not come up with CSS

Our newest applications are written in React, which encourages developers to think of things in terms of components — pieces of UI that are reusable in different contexts. The Cascading part of CSS interferes with that, however: depending on the context your component is dropped into, it may look drastically different across usages. When that is not what you want, Pete’s ideas center around reusing components, not CSS classes.

As you can imagine, this idea is largely controversial in a conference with a name like CSSConf, but I will continue to keep my eye on it. Pete’s thought leadership on this topic inspires me to challenge norms and dare to envision things differently. After all, if we’re constantly fighting with our tool (CSS), that tool may not be right for the job.

Thanks for reading! For a full list of talks and slides from the conference, check out https://2016.cssconf.com/#videos.

The one thing you need that will make you and your team successful

Over the last 20 years of my career, I have worked with a lot of different people and lot of different teams. Some were very successful, and some were not. I am always trying to understand what makes successful people tick, and what I can do differently to be more successful.

Your Energy

The one thing that I have found that consistently is the determining factor as to whether a person or team will be successful is their energy. Energy takes a lot of different forms and states. There is high energy and low energy, positive energy and negative energy. Everything in the universe is energy. Everything has a force and a pull and a gravity. Every person has an energy. Some people call it a “spark”, or “Spirit”, or “vibration”. If you pay close enough attention, you can see it, and sometimes you can even feel it. Have you ever walked into a room where a group of people had been discussing something serious, and you can feel the negative energy in the room. Have you ever worked with a very successful charismatic leader who just seemed to attract winners? Why is that? What is that?

The most successful people and teams that I have encountered or had the privilege to work with have very positive and high energy. Everyone on the team is excited and passionate to be working there. They love what they do, they love working with the people on their team, they know they are going to win, they know they are making a positive difference in the world, and they have fun winning.

That kind of positive high energy feeds and builds on itself. The more positive high energy people you get together, the better the team will be. It’s like waves in an ocean oscillating at the same frequency. They multiply the goodness. These are the people who see the future, and the solutions and the answers and choose not to focus on the past or the problems.

How does your energy affect others around you? How does other’s energy affect you? How does your boss’s energy affect you?

Emotional Contagion

The opposite of positive high energy, is negative low energy. It manifests itself in people who fixate on the past and the problems. They are the people who think that it can’t be done. They are the “We’ve tried that before and it didn’t work” people. They are the naysayers. They typically start all new requests for change with “No!”. They are always bickering or blaming others or finger pointing or micromanaging.

There are many possible reasons for a person’s negativity. It could be their FUD (Fear Uncertainty and Doubt) about the situation. It could be their fear of failure, or their fear of embarrassment, or their fear of looking stupid in front of their peers, or their insecurity in their position.

The truth is that your emotional state is affected by those around you. Just like positive energy builds on itself, so does negative energy. It’s like a rotten apple, or a cancer. It grows and infects others who are near it. I have seen one negative energy person bring down an entire team.

And as with any cancer, it has to be cut out quickly. It’s easy to say, but hard to do. I have seen too many good managers who I respected wait way too long letting a negative energy person fester in the team.

It’s Up to You

The reality is that you have the power to choose how you exert your own energy. You have the power to choose who you want to work with. Not everyone is positive high energy all the time. We all have our good days and bad days, but you can choose to be passionate and excited and have a high positive energy, or you can be the opposite.

So how can you get started growing your positive high energy?

  • Acknowledge that we aren’t perfect, but we can try to make things better.
  • Ignore the naysayers
  • Don’t fixate on what other people think (or what you think they think about you)
  • Do what you think is right
  • Stay in the present, and dream of the future. You can’t live in the past.
  • Acknowledge it, learn from it, and move on
  • Imagine an amazing future. What would that look like? Feel like?
  • Exercise and eat healthy
  • Use positive words. Say Thank You.
  • Smile and laugh
  • Play games and have fun.

Next Up

Tell me what part of our story you want to hear next. How do you build a team and culture that enables you to execute on your vision? Follow me on twitter @bchagoly and @bazaarvoicedev to be the first to read new related posts and to join the conversation.

Intern Demo Day

As the summer comes to an end, so do the internships for numerous university students here at Bazaarvoice. This past week, the interns were given an opportunity to present a summary of their accomplishments. This afternoon of presentations, known as the Bazaarvoice “Intern Demo Day”, highlighted the various achievements throughout the company, not just in the R&D department.

The following is a short summary of the great work our interns complete this summer as well as some images from the “Intern Demo Day”.

CHASE PORTER: My project, which I have named “The Great (red)Shift”, is intended to improve data accessibility for computed aggregated counts of various canonical events written to HBase. To do this I designed a data warehouse in Amazon Redshift that I loaded with transformed aggregated counts extracted from the tables in HBase. This makes the counts readily SQL query-able in an incredibly fast system whereas before they had to be computed with performance heavy queries from Raw Logs generated by Cookie Monster. The biggest block for this project was in processing the data from HBase which was stored as serialized bytes and needed to be handled uniquely for different types of canonical events (i.e. pageviews, impressions, features) to translate into a readable form for Redshift.

BEN DEVORE: My product is web crawler written in node.js that scrapes clients’ webpages for product data in order to build their product feeds for them. For many of Bazaarvoice’s smaller clients, building and maintaining their product feed is a significant obstacle in the onboarding process. This tool aims to clear that obstacle by taking this task out of there hands.

STONEY MCCRARY: So I have been fortunate enough to get to work on several different pieces in curations but I am going to talk on what I have been hammering on for the last couple of weeks. More and more of our high volume clients are receiving millions of hits a day and this has caused performance to become a higher priority problem for them. In response to this, we are focusing our efforts on building a new display with performance in mind. Performance for the display centers around only providing the minimal amount of data needed and supply the rest as necessary. The piece I will be showing is the display carousel and how it dynamically loads and dumps the data to allow for faster loading and to keep browser memory low.

ZESHAN ANWAR: Eagle is a dashboard built for our Incubator team. With so many moving parts, it was important we had a summarized ‘birds-eye’ view of the team in one place. Eagle was initially meant to be an aggregation of all our Jenkin builds; a single view of all our jobs across our different Jenkins environments. However, it grew to also include JIRA and GitHub statistics. My other project was optimizing our UI tests by having them run concurrently. Our old UI tests were extremely slow, and by running them in parallel we drastically reduced test times.

ZESHAN_Eagle(1)

ZESHAN_Eagle(3)

BRENDON KELLEY: Testing Framework: This summer my project was to help build out a new testing framework for Curations. The current automation tests used for Curations is Saladhands. Before my internship, there wasn’t much if any automation tests for the submission/direct upload capability of Curations. I worked on creating tests and a CI environment for submission in a new testing framework called Intern. One of tests includes a language translation test using mongoDB as an endpoint to store the various languages’ text. Intern is a javascript based testing framework which will allow developers to contribute to writing tests since Curations is mostly javascript. I’ve also worked on updating and creating new console tests in this framework. The foundation built this summer in Intern will enable the ability to further contribute to the framework.

KRYSTINA DIAO: My main project for the summer was to analyze and report the effectiveness of the implementation of the new Connections Knowledgebase. Through Salesforce, I collected and analyzed the number of cases, time spent on each case, etc. After drawing my conclusions, I decided to present my findings via data visualization methods (JavaScript’s C3 and D3 libraries) and provide actionable insights on how this information can be leveraged. This information is valuable in that it can be used for future product KB decisions, as well as understanding how much time, manpower, and money is saved by having a KB.

Krystina Diaos (3)

Krystina Diaos (2)

Krystina Diaos

MARKO SAVIC: Over the summer, I was a part of the SEO Team. I managed to create a tool on pagesManaged and keywordsManaged feed for every Spotlights client. Generated feeds will be consumed by SeoClarity tools on a daily basis. This helps in identifying search rank gains on the specified keywords and pages where Spotlights are present. The SeoClarity reporting will help in proving out Spotlights value and eventually lead to Spotlights renewal/upticks.Also, I created algorithm tweaks on the PINS (Post Interaction Notification System) Generator that take into account product activeness, product relevancy and review count, and use them to ask the user to write reviews on the most relevant products.

TREVOR NELLIGAN: Here is a description of my project: I worked on the Aperture Component library and many of the projects it supports. Aperture is build in React, and its purpose is to be used as an internal Bazaarvoice tool for constructing web pages. Using Aperture, anyone at Bazaarvoice can easily create a functional, intuitive, Bazaarvoice themed webpage, all with the building blocks Aperture provides.

Using the Aperture library, I helped the construction of numerous pages for the curations beta console. I personally built the interface for a new client-facing template builder, which will allow clients to create curations templates quickly and easily without having to go through an implementation engineer and a long process, as was the case previously. I also supplied custom Aperture components for several projects, like the content curation beta page.

RAMIE RAUFDEEN: The mixer is a component of our product recommendations engine which differentiates shoppers, and optimizes recommendations for them. This is primarily derived from their shopping behavior – in real time. Prior to the mixer, product recommendations were aggregated from multiple sources, using the same algorithm for every shopper. Shoppers are now categorized based off of a set of rules (using the shopper’s profile data), each of the rules map to a plan (which you can think of as an ‘algorithm’). A plan defines how recommendations should be mixed from each of the sources. For example, if source B has proven to have a higher conversion rate for ‘heavy-shoppers’, the plan for ‘heavy-shopper’ would give a higher weighting to source B. We can now target specific types of shoppers when it comes to product recommendations. This also sets the groundwork for a more granular machine learning implementation in the future.

Science4Science3Science2

We want to thank all the interns who spent time with us this summer and wish you the best back at school. We look forward to hearing about all the great things you all develop in the future.

If you are interested in an internship at Bazaarvoice, please contact kindall.heye@bazaarvoice.com.

Don’t Look Back in Anger: Retrospectives, Software Development and How Your Team Can Improve

Retrospective – This term can elicit a negative response in people in the software development industry (verbally and physically).  After all, it is a bit of a loaded term.  Looking back can be painful especially since that usually means looking back at mistakes, missteps and decisions we might want to take back.
retro-940x272

I have worked for over a decade in software development and information technology fields and during that time I’ve been involved in many meetings you could label as ‘retrospective’.  Some of these have been great, others terrible.  Some were over-long sessions of moaning and wailing resulting in a great deal of sound and fury, signifying little while others have been productive discussions that led to positive, foundational change in the way a team operated.

Looking back at all of that, I’ve realized a few common truths to the process of retrospectives and how it relates to building software:

  • They’re necessary – there’s always room for improvement.
  • Don’t focus on the negative – focus on the constructive (ie: “how we can improve”)
  • You need an actionable to-do-list – figure out what to change, quickly, then act on it

For the past five months, my team has been employing a process to facilitate retrospectives based around the three bullet points above to foment positive change in the way our team works.  This blog post will detail how we do this.  This is not to say, “you should perform retrospectives and you should do them this exact way”, because every software team works differently.  This however is to say, “here’s an example of how the retrospective process was done with success”.  Maybe it could work for you or give you ideas on how you can form your own process that works best for your team.

Step 1: Recognize!

Kermit_-_YOU_BETTA_RECOGNIZE_400x400

 

 

 

 

 

 

OK, let’s get the grisly stuff out of the way first:  Your team needs to improve.  And that is OK – because no team is perfect and no matter how well we execute on our goals, there’s always room to improve.  In fact, if your team is not at some regular interval assessing or reassessing how you design software, you’re probably ‘doing it wrong’ because (can you smell the theme I’m cooking up here yet?) – there’s always room to improve.  If you’re exploring or actively participating management concepts like Kanban/continuous improvement, this process is essential.

Step 2: Togetherness

OK – you want to improve your process and are ready to discuss how that can be done.  How does the rest of your team feel about that?  You can drag a horse to water and you can drag your team mates into a meeting but you can’t make either drink (unless you’re looking to rack up some workplace violations and future sensitivity training).

This is not how we define team work.

It’s important to get a consensus from the entire team in question before proceeding with a retrospective.  This will actually be easier than some might think.  After all – in software engineering, we’re all trying to solve complex problems.  This is just another problem that needs figuring out.

Before going further, you’re going to need a few things:

  • Somewhere to meet
  • Someone to take notes
  • Someone to keep time
  • Somewhere to post your meeting notes (Jira, Confluence page, post-it notes, whiteboard, whatever works for you)
  • Some time (30 minutes to 1 hour, depending on the size of your team)

Step 3: Good Vibrations

tumblr_mvvj0lM5LS1s65qnho1_400 












Come on, you know you loved this song back in the day.

The tough stuff is out of the way at this point.  You and your team have decided to fine tune your process and have met in a room together to do so (the 2 boxes of Shipley’s donuts you brought with you is barely a tangent in this case).

National-Donut-Day-6.7.14

 

 

 

 

 

 

 

When we started performing retrospectives regularly on the Shopper Marketing team, we started our discussion by talking about our recent successes as a team.

In this case, we as a team, one by one went around the room and listed one thing during the past period (sprint, release cycle, week, microfortnight, etc. – the interval doesn’t matter, whatever works for your team) and offered 1 thing we felt the team did well.  Doing this has a couple of functions:

  • It helps the team jog its memory (a lot can happen in a short amount of time).
  • It helps celebrate what the team has done recently (wallow in your success!).
  • It sets the right tone (before we focus on what needs to change, lets give a shout out to what we’re doing right).

Note that, especially with larger teams, sometimes one’s perception of how well something might have gone might be very different from someone else’s.  If Jane, your engineering head for your web service says she thought the release of the bulk processor was on time and under budget but Bill, your UI lead contends that it in fact wasn’t on time, wasn’t under budget and wiped out an entire village and made some kids cry, then this is the point where you should pause the retro and have a brief discussion about the matter and possibly a follow-up later.  This leads us to our next step…

69526590-1

















Nick Young asking the important questions of our time.

Step 4: What Could Be Better?

Now that you’re full of happy thoughts about what you’ve done well as a team (and full of donuts) its time to discuss change.  This process is the same as the previous step only this time, each team member must call out one thing they think the team could have done better.

homer-simpson














Mmmmmmm... Donuts...

Note that this isn’t about what was bad, or what sucked, or so-and-so is a doofus and has bad breath – but what can the team do better.  In order to keep the discussion productive and expedient, each person is limited to 1 item and that item has to be something the team has direct agency over.  You also might find it handy to nominate a moderator for this point of discussion.

Here’s an example of a real issue that was brought up in a retrospective at a previous company I worked for and the right and wrong way to frame it when it comes to focusing on ‘your team’s agency’.

Marketing and sales sold our client a bunch of features that our product doesn’t do and we had to work 90+ hours this week just to deliver it”

-Or –

“We should work and communicate closer with marketing and sales so they understand our product features as well as the effort required in developing said features”

While the first statement may be true it’s the second one that actually encapsulates the problem in a way the team has a manner of dealing with.  I found that focusing strongly on the positive aspects of a retrospective discussion helps some teams to self-align toward reaching toward the kind of perspective found in the latter statement above than the former.

Other teams I found realized the need to appoint a moderator to help keep the discussion focused.  Its important to figure out what sort of need your team has in this regard early in the retrospective process.  This might seem tough at first (and emotionally charged) and that’s OK.  What’s important is to keep a positive frame of mind to work through this discussion.

As previously stated, if there appears to be a major disconnect between team members regarding issues that could have been done better, this is a good time to discuss that and hopefully iron out any misunderstandings or make plans to do so soon after the retrospective.

Whoever is taking notes, I hope you’re getting all of this because we’re about to tie it all together.

Big_Lebowski_Rug_Blue_Shirt_POP

Step 5: Decide

By now, you and your team will have come up with a good deal of input on what the team did well and what could be improved.  You might be surprised but having done this numerous times, often a very clear pattern to what the team though went well and could have improved on appears and does so quickly.

As a team, go over the list of items to potentially improve on and note which ones are the strongest, common denominator.  Pick 1 or 2 of these and this will provide some focus for the team for the next period of work (sprint, cycle, dog year, etc.).  These can be process-oriented intangibles or even technical issues.  Here are some examples:

Our front end engineers were idle during much of the sprint waiting on services to be updated”

Or

We spend 1-2 hours doing process Y manually every week and we should probably automate it

The first item has to do with improving the team’s process of how it interacts with itself.  The other is a clear, intangible need that can be solved using a technical solution.

action-comics-1-superman-thumb-450x6071

















A high-valued issue.

Once the team has identified 1 or 2 high-valued issues they feel need to be addressed, its time to do just that.

Step 6: Act!

This is the most important step.  It’s one thing to identify a problem to be solved, another to actually act on it.  Hold a discussion as to what potential solutions to your issues might be.  The time keeper at this point should be keeping an eye on the clock to help the team keep the meeting moving forward in a productive manner.  At this step, try and keep the following in mind:

  • By now, whatever issues the team has identified that need to be addressed, it should be something the team has full agency to solve (the matter is firmly in the team’s hands)
  • Depending on the issue at hand, it may be something complex and the team might need a couple of tries to solve it. Don’t get discouraged if the issue appears gargantuan or if you can’t solve it within your next period of work alone.

tumblr_lsvmqpEpk21qdku5lo2_500

 

 

 

 

 

Once you’ve brainstormed on solutions to how you and your team can make improvements, it’s time to really act.  In this case, you should have some ideas that can be transformed into action items (you get a Jira ticket and you get a Jira ticket and…).  The retro should conclude with the team being able to break down a solution for improving into 1 or more workable tasks and then assigning them to team members for the next sprint, cycle or galactic year.

The key to making this all work is that last part.  What comes out of the retrospective should be treated as any other item of work your team normally commits to in the confines of your regular working period (sprint, release, Mayan lunar month, etc.) with one or more team members responsible for delivering the solution, whether technical or otherwise.

Now you’re ready to move forward with improving as a team and dive into that post-donut food coma.

DakIK



















Woooo! Food coma!

Step 7: Going Forward

Whether you use this process or some other framework to run a retrospective, repeating that process is very important.  To what degree or frequency your team does that is for the team to decide.  Some teams I’ve worked with performed retrospectives after every sprint, others only after major releases and some just whenever the team felt it was necessary.  The key is for the team to decide on that interval and after your initial retrospective, be sure to devote some time at the beginning of subsequent retrospectives to review your action items from the previous session (did you complete them and were the issues related to them resolved?).

Hopefully this post has given you an idea how your team can not only perform retrospectives but also improve your team’s working process and even look forward to looking back.

And on that note…

HackerX event hosted at Bazaarvoice

The Bazaarvoice headquarters hosted the July 20th HackerX event in Austin, Texas. The event featured not only Bazaarvoice, but also included Facebook, Amazon, and Indeed. 70+ engineers participated in onsite interviews and networking. HackerX commented that “this was one of the most successful events” they have ever seen.

Gary Allison, Executive Vice President of Engineering, kicked off the event with a compelling message about Bazaarvoice and why this is an awesome place to work.

HackerX started in 2012 with invite-only, face-to-face recruiting events that connect tech talent with some of the world’s most innovative companies. Currently, they operate over 100+ events in 40+ cities, 15+ countries annually.

See www.hackerx.org for additional information.

hackerx2

hackerx4

hackerx1

Injecting Applications onto Third-Party Webpages Made Easy

Bazaarvoice’s Small Web App Technologies (SWAT) team is pleased to announce that we are open sourcing swat-proxy – a tool to inject applications onto third-party webpages.

In third-party web application development it is difficult to be certain how our applications will look and behave on a client’s webpage until they are implemented. Any number of things could interfere – including other third-party applications! Delivering applications that don’t work can obviously have a severe negative impact on both our clients and us.

One solution is to inject – or proxy – our applications onto the client’s web page. This way we can ensure they work correctly – before they go into production. I wrote swat-proxy to do exactly that, acting as a man-in-the-middle between browser and web server. As the browser requests web pages from the server, swat-proxy intercepts the response and proxies our application into it. The browser renders the web page as if it contained our application all along – exactly simulating the client having implemented it. Now we can be certain how our applications will look and behave.

Other tools exist to accomplish this task, but none are as front-end developer-friendly as swat-proxy: it is written entirely in Javascript – plugging in nicely to our existing workflows – and uses familiar CSS selectors to target DOM elements when injecting content. It is run locally using NodeJS and is very easy to use.

We have found swat-proxy to be incredibly useful when rapidly iterating on prototypes and ensuring the behavior of our applications before they are released to production – we hope you do too! We are releasing it to the larger world as open source, under the Apache 2.0 license. Please download it, try it out, and let us know what you think (in comments below, or as issues or pull requests on Github).

Why we do weekly demos

If you are part of an agile, or lean, or kanban development team, you probably do or have done demos at one point. Some people call them “end of sprint” demos. Some people call them “stakeholder” demos. We are pretty informal and irreverent about it at Bazaarvoice, and we just call them “demos” because giving them too formal of a name, or process will defeat the purpose. Demos are an amazingly valuable part of the development process, and I highly recommend that your team start doing them weekly.

Why not to do demos

There are a lot of reasons why not to do demos. If you are doing demos because of any (or all) of these reasons, you are probably doing it wrong. (Oh, did I mention that this is my personal opinionated viewpoint?)

  • We are forced to do demos so that management can keep an eye on us.
  • We do demos so management can make sure that each person is actually doing work.
  • We do polished demos for our sales and marketing team and they have to be perfect.
  • I don’t really know why we do demos, but my boss told me to, so I just show up.

Prep Work

Before you can do effective demos that help accelerate your development, your delivery, your quality and your culture, you need to set the stage for success. So how do you do that?

At Bazaarvoice, we believe in hiring smart, passionate owners that we can trust to do the right thing for the company, for our customers, and most importantly for our consumers. If everyone is bought into the vision and mission that you are all working to accomplish together, and if everyone is engaged and proactively helping to find creative solutions to those problems, then it is a lot easier to have productive weekly demos. You want to have an environment where everyone can speak without judgment, and where they know that their random idea will be heard and appreciated and not shot down. If everyone is coming from this place of openness, transparency and trust, then you can get some great demos.

Why

We believe that demos are part of the creative, collaborative and iterative process of developing amazing software solutions. The point of the demo is it be a time when everyone on the team (and stakeholders) can get together and celebrate incremental progress. It’s a way to bring out all the current work out into the light and to discuss it. What’s good, what still needs work, what did we learn, what do we need to change, and how could we make it even better. It lets us course correct earlier before we fly into the mountain. And It gives others awareness of what’s going on and brings the “aha” moments.

It’s about learning, and questioning, and celebrating. Everyone who demos gets applause. All work that gets done is important and should be celebrated. All work and everyone means Engineers, QA, UX, Product Managers, Marketing, Documentation, Sales, Management, and anyone who completed something that helped move the cause forward.

When

I like to do demos at 4pm on Fridays. Ouch right?! Well that’s kind of the point. Doesn’t everyone just want to check out early for the weekend? Nope… Well let’s hope not. If people are leaving early and complaining, then pause and go back and re-read the “Prep Work” section.

We actually use that time because it’s the end of the week. It gives us the most time during the week to get things done, and it gives us free space after the meeting if we need to run long for a brainstorming session or to discuss things more deeply. We have 14 engineers/UX/QA on our team, and we schedule demos for 30 minutes. If we finish early, or not, we usually stick around for beers and games anyway until 6-7pm because we just like hanging out with each other. Imagine that.

So isn’t that great? You get to hang out with your friends, show off your accomplishments, and leave work for the week feeling pride in what was accomplished both personally and by the entire team. You are hopefully excited about the future, and probably brainstorming new ideas over the weekend from what you learned from the demos.

Planning

We do planning for our week on Monday mornings. That part is important too and it’s the ying to the demo’s yang. People get into the office on Monday morning and are fresh and ready to go and hungry to know what they can pick up next, so it’s the perfect time to set the stage for the week. What are our top priorities for the week? What do we absolutely have to get done by the end of the week (aka by demos on Friday afternoon)?

Planning is a great venue to ensure everyone is in sync with the vision and Product priorities. Product shares upcoming business epics, UX walks us through new mockups and user findings, Engineering discusses new platform capabilities that we should consider, and QA reminds us of areas we need to harder. It helps form this continuous cycle of planning and validation. It’s a bit odd, because we are very Kanban and flow oriented, but we have found that having some timeboxes around planning and demo’ing gives the team a sense of closure and accomplishment. It’s important to occasionally step back and assess the progress. We have found this sets a really good and sustainable cadence for a productive team.

Next Up

Tell me what part of our story you want to hear next. How do you build a team and culture that enables you to execute on your vision? Follow me on twitter @bchagoly and @bazaarvoicedev to be the first to read new related posts and to join the conversation.

How to seamlessly move 300 Million shoppers to a highly scalable architecture, part 2

Divide and Conquer

As Engineers, we often like nice clean solutions that don’t carry along what we like to call technical debt.  Technical debt literally is stuff that we have to go back to fix/rewrite later or that requires significant ongoing maintenance effort.  In a perfect world, we fire up the the new platform and move all the traffic over.  If you find that perfect world, please send an Uber for me. Add to this the scale of traffic we serve at Bazaarvoice, and it’s obvious it would take time to harden the new system.

The secret to how we pulled this off lies in the architecture choices to break apart the challenge into two parts: frontend and backend.  While we reengineered the front-end into the the new javascript solution, there were still thousands of customers using the template-based front end.  So, we took the original server side rendering code and turned it into a service talking to our new Polloi service.  This enabled us to handle request from client sites exactly like the Classic original system.

Also, we created a service improved upon the original API but was compatible from a specific version forward.  We chose to not try to be compatible for all version for all time, as all APIs go through evolution and deprecation.  We naturally chose the version that was compatible with the new Javascript front end.  With these choices made, we could independently decide when and how to move clients to the new backend architecture irrespective of the front-end service they were using.

A simplified view of this architecture looks like this:

Divide_Conquer_arch

With the above in place, we can switch a Javascript client to use the new version of the API through just changing the endpoint of the API key.  For a template-based client, we can change the endpoint to the new referring service through a configuration in our CDN Akamai.

Testing for compatibility is a lot of work, though not particularly difficult. API compatibility is pretty straight forward, which testing whether a template page renders correctly is a little more involved especially since those pages can be highly customized.  We found the most effective way to accomplish the later since it was a one time event was with manual inspection to be sure that the pages rendered exactly the same on our QA clusters as they did in the production classic system.

Success we found early on was based on moving cohorts of customers together to the new system. At first we would move a few at a time, making absolutely sure the pages rendered correctly, monitoring system performance, and looking for any anomalies.  If we saw a problem, we could move them back quickly through reversing the change in Akamai. At first much of this was also manual, so in parallel, we had to build up tooling to handle the switching of customers, which even included working with Akamai to enhance their API so we could automate changes in the CDN.

From moving a few clients at a time, we progressed to moving over 10s of clients at a time. Through a tremendous engineering effort, in parallel we improved the scalability of our ElasticSearch clusters and other systems which allowed us to move 100s of clients at a time, then 500 clients at time. As of this writing, we’ve moved over 5,000 sites and 100% of our display traffic is now being served from our new architecture.

More than just serving the same traffic as before, we have been able to move over display traffic for new services like our Curations product that takes in and processes millions of tweets, Instagram posts, and other social media feeds.  That our new architecture could handle without change this additional, large-scale use case is a testimony to innovative engineering and determination by our team over the last 2+ years. Our largest future opportunities are enabled because we’ve successfully been able to realize this architectural transformation.

Rearchitecting the Team

In addition to rearchitecting the service to scale, we also had to rearchitect our team. As we set out on this journey to rebuild our solution into a scalable, cloud based service oriented architecture, we had to reconsider the very way our teams are put together.  We reimagined our team structure to include all the ingredients the team needs to go fast.  This meant a big investment in devops – engineers that focus on new architectures, deployment, monitoring, scalability, and performance in the cloud.

A critical part of this was a cultural transformation where the service is completely owned by the team, from understanding the requirements, to code, to automated test, to deployment, to 24×7 operation.  This means building out a complete monitoring and alerting infrastructure and that the on-call duty rotated through all members of the team.  The result is the team becomes 100% aligned around the success of the service and there is no “wall” to throw anything over – the commitment and ownership stays with the team.

For this team architecture to succeed, the critical element is to ensure the team has all the skills and team players needed to succeed.  This means platform services to support the team, strong product and program managers, talented QA automation engineers that can build on a common automation platform, gifted technical writers, and of course highly talented developers.  These teams are built to learn fast, build fast, and deploy fast, completely independent of other teams.

Supporting the service-oriented teams, a key element is our Platform Infrastructure team we created to provide a common set of cloud services to support all our teams.  Platform Infrastructure is responsible for the virtual private cloud (VPC) supporting the new services running in amazon web services. This team handles the overall concerns of security, network, service discovery, and other common services within the VPC. They also set up a set of best practices, such as ensuring all cloud instances are tagged with the name of the team that started them.

To ensure the best practices are followed, the platform infrastructure team created “Beavers” (a play on words describing a engineer at Bazaarvoice, a “BVer”). An idea borrowed from Netflix’s chaos monkeys, these are automated processes that run and examine our cloud environment in real time to ensure the best practices are followed.  For example, the “Conformity Beaver” runs regularly and checks to make sure all instances and buckets are tagged with team names. If it finds one that is not, it infers the owner and emails team aliases of the problem.  If not corrected, Conformity Beaver can terminate the instance.  This is just one example of the many Beavers we have created to help maintain consistency in a world where we have turned teams lose to move as quickly as possible.

An additional key common capability created by the Platform Infrastructure team is our Badger monitoring services. Badger enables teams to easily plug in a common healthcheck monitoring capability and can automatically discover nodes as they are started in the cloud. This service enables teams to easily implement these healthcheck that is captured in a common place and escalated through a notification system in the event of a service degradation.

The Proof is in the Pudding

The Black Friday and Holiday shopping season of 2015 was one of the smoothest ever in the history of Bazaarvoice while serving record traffic. From Black Friday to Cyber Monday, we saw over 300 million visitors.  At peak on Black Friday, we were seeing over 97,000 requests per second as we served up over 2.6 billion review impressions, a 20% increase over the year before.  There have been years of hard work and innovation that preceded this success and it is a testimony to what our new architecture is capable of delivering.

Keys to success

A few ingredients we’ve found to be important to successfully pull off a large scale rearchitecture such as described here:

  • Brilliant people. There is no replacement for brilliant engineers who are fearless in adopting new technologies and tackling what some will say can’t be done.
  • Strong leaders – and the right leaders at the right time. Often the leaders that sell the vision and get an undertaking like this going will need to be supplemented with those that can finish strong.
  • Perseverance and Determination – building a new platform using new technologies is going to be a much bigger challenge than you can estimate, requiring new skills, new approaches, and lots of mistakes. You must be completely determined and focused on the end game.
  • Tie back to business benefit – keep business informed of the benefits and ensuring that those benefits can be delivered continuously rather than a big bang. It will be a large investment and it is important that the business see some level of return as quickly as possible.
  • Make space for innovation – create room for engineers to learn and grow. We support this through organizing hackathons and time for growth projects that benefit the individual, team, and company.

Reachitecture is a Journey

One piece of advice: don’t be too critical of yourself along the way; celebrate each step of the reachitecture journey. As software engineers, we are driven to see things “complete”, wrapped up nice and neat, finished with a pretty bow. When replacing an existing system of significant complexity, this ideal is a trap because in reality you will never be complete.  It has taken us over 3 years of hard work to reach this point, and there are more things we are in the process of moving to newer architectures. Once we complete the things in front of us now, there will be more steps to take since we live in an ever evolving landscape. It is important to remember that we can never truly be complete as there will be new technologies, new architectures that deliver more capabilities to your customers, faster, and at a lower cost. Its a journey.

Perhaps that is the reason many companies can never seem to get started. They understandably want to know “When will it be done?” “What is it going to cost?”, and the unpopular answers are of course, never and more than you could imagine.  The solution to this puzzle is to identify and articulate the business value to be delivered as a step in the larger design of a software platform transformation.  Trouble is of course, you may only realistically be able to design the first few steps of your platform rearchitecture, leaving a lot of technical uncertainty ahead. Get comfortable with it and embrace it as a journey.  Engineer solid solutions in a service oriented way with clear interfaces and your customers will be happy never knowing they were switched to the next generation of your service.

authored by Gary Allison