SatRdays London 2023: Speakers | R-bloggers


SatRdays London is fast approaching, and we are pleased to announce the full lineup of speakers for the event! Read on for more info. If you want to join in the fun, visit the conference website to sign up!

Data comes in all shapes and sizes. It can often be difficult to know where to start. Whatever your problem, Jumping Rivers can help.

Keynote Speakers

Julia Silge – Posit

Julia Silge is a data scientist and software engineer at Posit PBC (formerly RStudio), where she works on open source modeling and MLOps tools. She is an author, an international keynote speaker, and a real-world practitioner with a focus on data analysis and machine learning. Julia loves text analysis, creating beautiful charts, and communicating about technical topics with diverse audiences.

Oliver Hawkins – Financial Times

Oliver Hawkins works as an editorial data scientist for the visuals and data journalism team at the Financial Times. He has previously worked as a statistical researcher and a data scientist for the House of Commons Library and as a data journalist for the BBC. He is interested in statistics, machine learning and data visualization.

Contributed Talks

Botan Ağın and Michael Stevens – SamKnows

Automated Reporting: Billions of Internet measurements, hundreds of reports, and one repository to rule them all

SamKnows has been a leader in Internet performance measurement for over 14 years. The reason we exist is to provide the source of truth for how the Internet is actually performing. The data we collect can be used as a common language between government regulators, Internet service providers, academia, and content providers to optimize and improve Internet performance for all.

SamKnows uses R to handle a huge range of automated and self-service workloads every day. Keeping track of each report’s recipients, delivery schedule, dependencies, and deployment process can be difficult, especially in the nightmare scenario of suddenly needing to move all of your jobs to a new server or cloud environment.

In this presentation, we’ll talk about how we structure our regularly scheduled reports as standardized entities within a monorepo. We’ll explain how this approach reduces the time it takes to set up a report, makes it easier for new team members to contribute, and lets us maintain standards while keeping the flexibility to deliver work in a variety of formats, with differing levels of complexity and opportunities for manual intervention. We’ll detail the specific workflows that take the terabytes of data collected by SamKnows from cloud and on-premises data sources, process them into R Markdown documents, formatted spreadsheets, and raw CSV outputs, and distribute them via cloud file storage, FTP servers, email, Slack and more.

Vyra Apostolova and Laura Cole – National Audit Office

Government Spending Check

The National Audit Office supports Parliament in holding the government to account through its financial audit and value for money work. The Analysis Hub is a central team that uses a range of analytical techniques to support both aspects of the work. This presentation will show two examples of how we in the Analysis Hub use R to support our mission of holding government to account.

We use R to reproduce the complex models that departments employ to produce accounting estimates for their financial accounts. Our R reproductions allow us to assess whether the departments have correctly applied their chosen methodology and to uncover any model integrity issues. We also apply additional sensitivity tests through Monte Carlo simulations to capture the uncertainty around model outputs. The presentation will include an overview of our approach and a demo of reproducing a dummy model.

We have also created an R Shiny app, the COVID-19 Cost Tracker, which brings together UK Government data on the cost of measures taken in response to the COVID-19 pandemic. It is one of the very few sources of comprehensive information on COVID-19 related spending, and the only one available as an interactive tool. With it, the public can check spending by department and interact with bubble charts to find the cost of individual policies along with their category of spending. The presentation will include an overview of how the data analytics team and audit team collaborated to produce the outputs, and a demo of the app.

Andrew Collier – Fathom Data

Dark Corners of the Tidyverse

Within the Tidyverse realm, there are packages that are always in the limelight. These are the Titans: famous and beloved, often invoked and virtually indispensable. There are other, lesser-known packages that stand quietly in the shadows: unknown, somewhat obscure and almost forgotten, waiting for their moment to shine.

I’ll talk about five of these unsung heroes of the Tidyverse, demonstrate their qualities, and show how they can help you succeed in your next data science pursuit.

Jack Davison – Ricardo Energy & Environment

“Put it on the map!” – Advances in Air Quality Data Analysis

“An understanding of air quality is important because it can have significant public health, environmental and economic impacts. However, air quality data is complex, constantly changing over space and time, and influenced by myriad factors such as meteorology and human activity. This makes air quality analysis challenging, and communicating the results of this analysis is more challenging still!

Exactly a decade ago, the {openair} package was written to provide an open-source toolkit to help air quality practitioners make the most of their data, and it is still widely used today in academia, consulting, and industry. Although {openair} hasn’t changed much in recent years, a lot of thought has gone into extending it by taking advantage of more recent tools and packages.

In this talk I will discuss how we recently married {leaflet} and {openair} to create effective, interactive air quality maps. In particular, I will discuss the development of the {openairmaps} package – a toolset that makes it easy to create interactive “directional analysis” maps to help explore the geospatial context of pollution monitoring data.”

Russ Hyde – Jumping Rivers

Does code quality even matter in Data Science?

It depends!
If you need to quickly summarize some data for an ad hoc request, write that code in whatever way gets the job done.

But what happens when you start getting a lot of similar requests, or you’re working on a more important project, or you’re collaborating with a large team? Now, productivity needs to be seen ‘across the team’ and ‘across projects’. What can you do to help yourself and your co-workers, and what tools are there to help?

Code quality relates to those aspects of software that make it easy to work with, easy to explain to others and easy to maintain or extend.

In this talk, I’ll take you through the source code for an emerging analytics project. We’ll discuss how to (and how not to) write modular code. Along the way, we’ll talk about functions and calculations, body-tweaking, stamping out duplication, and some tools to help automate the boring low-level stuff that teams sometimes disagree about.

Ella Kaye and Heather Turner – University of Warwick

Sustainability and EDI (Equality, Diversity and Inclusion) in the R project.

The R project is over 20 years old, but its future is not secure: many members of the R Core Team are nearing retirement, and there are not enough new contributors to keep up with the work. We present several initiatives organized under Heather Turner’s ‘Sustainability and EDI (Equality, Diversity and Inclusion) in the R Project’ fellowship to encourage and train a new, more diverse generation of contributors. These include R Contributor Office Hours, Collaboration Campfires, Bug BBQs, a Translatathon, and an updated R Development Guide. The presentation is also a call to action to encourage others to join in supporting this language, a fundamental piece of software across many disciplines that is used by an estimated 2 million people.

For updates and revisions to this article, see the original post
