Web development is arguably the most popular area of software development right now. Software developers can make snappy, eye-catching websites, and build robust APIs. I’ve recently developed a specific interest in a less discussed facet of web development: web scraping.

Web scraping is the process of programmatically analyzing a website’s Document Object Model (DOM) to extract specific data of interest. Web scraping is a powerful tool for automating certain features such as filling out a form, submitting data, etc. Some of these abilities will depend if the site allows web scraping or not. One thing to keep in mind, if you want to web scrape, is that some websites will be using cookies/session state. So some automation tasks might need to abide by the use of the site’s cookies/session state. It should go without saying, but please be a good Samaritan when web scraping since it can negatively impact site performance.

Getting Started

Let’s get started with building a web scraper in an Azure Function! For this example, I am using an HTTP Trigger Azure Function written in C#. However, you can have your Azure Function utilize a completely different trigger type, and your web scraper can be written in other languages if preferred.

Here is a list of Azure resources that were created for this demo:

Azure Resources

Before we start writing code, we need to take care of a few more things first.

Let’s first select a website to scrape data from. I feel that the CDC’s COVID-19 site is an excellent option for this demo. Next, we need to pick out what data to fetch from the website. I plan to fetch the total number of USA cases, new USA cases, and the date that the data was last updated.

Now that we have that out of the way, we need to bring in the dependencies for this solution. Luckily, there is only one dependency we need to install. The NuGet package is called HtmlAgilityPack. Once that package has been installed into our solution, we can then start coding.

Coding the Web Scraper

Since the web scraper component will be pulling in multiple sets of data, it is good to capture them inside a custom resource model. Here is a snapshot of the resource model that will be used for the web scraper.

Resource Model

Now it’s time to start coding the web scraper class. This class will utilize a few components from the HtmlAgilityPack package that was brought into the project earlier.

Web Scraper Class

The web scraper class has a couple of class-level fields, one public method, and a few private methods. The method “GetCovidStats” performs a few simple tasks to get our data from the website. The first step is setting up an HTML document object that will be used to load HTML and parse the actual HTML document we get back from the site. Then, there is an HTTP call out to the website we want to hit.

Right after that, we ensure the call out to the website results in a success status code. If not, an exception is thrown with a few details of the failing network call.

Ensure the Call Out

We then load the HTML that we received back from the network call into our HTML document object. There are several calls to a method that will perform the extraction of the data we are looking for. Now you might be wondering what those long strings are in the method call. Those are the full xpaths for each targeted HTML element. You can obtain them by opening the dev tools in your browser, selecting the HTML element, and right-clicking it in the dev tools. From there, select “copy full xpath”.

Load the HTML

Next, we need to set up the endpoint class for our Azure Function. Luckily for us, the out of the box template sets up a few things automatically. In the endpoint class, we are merely calling our web scraper class and returning its results to the client calling the Azure Function.

Endpoint Class for Azure Function

Now comes time to test out the Azure Function! I used Postman for this, and these are the results.

Test out the Azure Function

Closing Thoughts

Overall, web scraping can be a powerful tool at your disposal if sites do not offer APIs for you to consume. It allows you to swiftly grab essential data off of a site and even automate specific tasks in the browser. With great power comes great responsibility, so please use these tools with care!

K.C. Jones-Evans, a User Experience Developer, Josh Lathery, a Full Stack Developer, and Sara Darrah, a User Experience Specialist, sat down recently to talk through our design and project development planning process that we implemented for part of a project. This exercise was to help improve our overall project development planning and create best practices moving forward. We coined the term the “Design Huddle” to describe the process of taking the feature from a high level (often one sentence request from our customer) to a working product in our software, improving project planning for software development.

We started the design huddle because the contract we were working on already had a software process that did not include User Experience (UX). We knew we needed to include UX, but weren’t sure how it would work given the fast paced (2-week sprint cycles) software process we were contractually obligated to follow. We needed to come up with a software design planning solution that allowed us to work efficiently and cohesively. The Design Huddle allowed us to do just that.

What does the design huddle mean to you?

Josh: Previously the technical lead would have all the design processes worked out prior to being assigned a ticket for development. The huddle meant that I had more ownership in the feature upfront. It was nice to understand via the design process what the product would be used for and why.

K.C.: The huddle was an opportunity to get our thoughts together on the full product before diving into development right away. In the past, we have developed too quickly and discovered major issues. Development early equated to too much ownership in the code, so changes were painful when something needed to be corrected.

Sara: The huddle for me meant the opportunity to meet with the developers early to get on the same page prior to development. That way when we had the final product, we could discuss details and make changes, but we were coming from the same starting point. I have been on other projects where I’m not brought in until after the development is finished- which

immediately strains the relationship between UX and Development due to big changes needed to finished code.

What does the design huddle look like?

  • The Team: UX Designer, at least one front-end or UX developer, a full-stack developer, the software tester, a graphic designer (as needed), and a Subject Matter Experts (as needed)
  • The Meeting: This took time to work out. As with any group, sometimes there were louder or more passionate individuals that seemed to overshadow the rest. At the end of the day the group worked better with order and consensus:
    • Agendas were key: The UX lead created the agendas for our meetings. Without an agenda it was too easy to go down a rabbit hole of code details. This also helped folks who were spread across multiple projects focus on the task at hand faster. We included time to report on action items, old business/review of any design items and set the stage of what you hope to cover as new design work.
    • Action Items: Create and assign actions to maintain in the task management system (JIRA). This was a good translation for developers and helped everyone understand their responsibility leaving the room. These also really helped with sprint planning and the ability to scope tasking.
    • The facilitator had to be assertive: Yes, we are all professionals and in the ideal world we could “King Arthur Round Table” these huddles. But the few times we tried this meeting were quickly off-track and down a rabbit hole. Many teammates would leave frustrated and feeling like we hadn’t made any progress. The facilitator was the UX specialist for our meetings, but we think the owner of the feature should facilitate. The facilitator needs to be willing to assert themselves during conversations, keep the meeting on track and force topics to be tabled for another time when needed.
    • Include everyone and know the crowd: The facilitator needs to quickly understand the team they are working with to figure out how to include everyone. One way we ensured this happen was to do an around the room at the end of each meeting.
    • Visual Aid that the whole team can see during the meeting: Choose the tool that works best for the topic at hand- a dry erase board, a wireframe, JIRA tickets, or a mock-up can help people stay on track and ensure common understanding.
    • Table tough items and know when to end the meeting: sometimes in the larger meetings we needed to call it quits on a debate to give more time for thinking, research, and discussions. Any member of the team could ask for something to be tabled. A tabled item was given a smaller group of individuals to work through the details in between the regular design huddle sessions.
    • Choose the team participants wisely: The first few meetings most likely will involve the entire team, but smaller huddles (team of 2 or 3) can often work through details of tabled items more efficiently.

What is the benefit?

  • Everyone felt ownership in the product by the end of the design. Sara’s favorite thing about this learning experience was when one of the developers I hadn’t worked with before said he loved the process. He had the opportunity to provide input early and often, and then by the time development started there weren’t really any questions left on how to implement. Josh- the process helped me feel like an engineer and not just a code monkey. K.C.-: In other projects, we have been handed mock-ups without context. This process helped the “gray area” be taken away.
  • Developers were able to tame the ideals of the UX designer by understanding the system, not just the User Interface. Developers could assist UX by asking questions, helping understand the existing system limitations, and raising concerns that the design solution was too complicated given the time we had.
  • The software tester was able to understand the flow of the new designs, ask questions and assist in writing acceptance criteria that was specific, measurable, attainable, realistic and testable (SMART).
  • She provided guidance when we needed to go from concept to reality and ensured we understood the design requirements.
  • It was critical to designing 1-2 sprints in front of expected development as part of Agile software development. It allowed for the best design to be used rather than forcing ourselves into something that could be completed in two weeks. Together we would know what our end goal was, then break down the concept into tiers. Each tier would have a viable product that was one sprint long, and always kept the end product design in mind.

The Design Huddle is a way for teams to collaborate early on a new feature or application. We feel it is a great way to work User Experience into the Agile software process and simplify project development planning. We have taken our lessons learned and applied to a new project the three of us are getting to tackle together and expanded the concept of the huddle to different members on the team. If you are struggling to incorporate proper design or feel frustration from teammates on an application task, this may be the software design planning solution for you!