Search Resources

36 Results

Selected filters:
  • The Carpentries
Introduction to R for Geospatial Data
Unrestricted Use

The goal of this lesson is to provide an introduction to R for learners working with geospatial data. It is intended as a prerequisite for the R for Raster and Vector Data lesson for learners who have no prior experience using R. This lesson can be taught in approximately 4 hours and covers the following topics:
  • Working with R in the RStudio GUI
  • Project management and file organization
  • Importing data into R
  • Introduction to R’s core data types and data structures
  • Manipulation of data frames (tabular data) in R
  • Introduction to visualization
  • Writing data to a file
The R for Raster and Vector Data lesson provides a more in-depth introduction to visualization (focusing on geospatial data) and to working with data structures unique to geospatial data.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Introduction to the Command Line for Economics
Unrestricted Use

A command line interface (OS shell) and a graphical user interface (GUI) are different ways of interacting with a computer’s operating system. The shell is a program that presents a command line interface, which lets you control your computer with commands entered at a keyboard instead of operating a GUI with a mouse/keyboard combination. There are several reasons to start learning about the shell:
  • The shell gives you power. The command line lets you do your work more efficiently and more quickly. When you need to do things tens to hundreds of times, knowing how to use the shell is transformative.
  • To use remote computers or cloud computing, you need to use the shell.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Introduction to the Command Line for Genomics
Unrestricted Use

Data Carpentry lesson to learn to navigate your file system; create, copy, move, and remove files and directories; and automate repetitive tasks using scripts and wildcards with genomics data. A command line interface (OS shell) and a graphical user interface (GUI) are different ways of interacting with a computer’s operating system. The shell is a program that presents a command line interface, which lets you control your computer with commands entered at a keyboard instead of operating a GUI with a mouse/keyboard combination. There are several reasons to start learning about the shell:
  • For most bioinformatics tools, you have to use the shell. There is no graphical interface. If you want to work in metagenomics or genomics, you’re going to need to use the shell.
  • The shell gives you power. The command line lets you do your work more efficiently and more quickly. When you need to do things tens to hundreds of times, knowing how to use the shell is transformative.
  • To use remote computers or cloud computing, you need to use the shell.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Life Science
Measurement and Data
The Carpentries
Introduction to web scraping
Unrestricted Use

Web scraping is the process of extracting data from websites. Some data available on the web is presented in a format that makes it easy to collect and use, for example as downloadable comma-separated values (CSV) datasets that can be imported into a spreadsheet or loaded into a data analysis script. Often, however, data that is publicly available is not readily available for reuse. For example, it can be contained in a PDF, in a table on a website, or spread across multiple web pages. There are a variety of ways to scrape a website to extract information for reuse. In its simplest form, this can be achieved by copying and pasting snippets from a web page, but this becomes impractical if there is a large amount of data to be extracted or if it is spread over a large number of pages. Instead, specialized tools and techniques can be used to automate this process by defining what sites to visit, what information to look for, and whether data extraction should stop once the end of a page has been reached or should follow hyperlinks and repeat the process recursively. Automating web scraping also makes it possible to run the process at regular intervals and capture changes in the data.
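As a rough illustration of "what information to look for" and "whether to follow hyperlinks", the sketch below pulls table cells and link targets out of a single page using only Python's standard-library html.parser module. The page content, the names in it, and the class name TableAndLinkParser are made up for this example; the lesson itself may use different tools.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this from a website.
PAGE = """
<html><body>
<table>
  <tr><th>Name</th><th>Party</th></tr>
  <tr><td>A. Example</td><td>Blue</td></tr>
  <tr><td>B. Sample</td><td>Green</td></tr>
</table>
<a href="/page2.html">Next</a>
</body></html>
"""


class TableAndLinkParser(HTMLParser):
    """Collect table rows and hyperlink targets from one HTML page."""

    def __init__(self):
        super().__init__()
        self.rows = []    # one list of cell strings per <tr>
        self.links = []   # href values of <a> tags (pages to visit next)
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._in_cell = True
            self.rows[-1].append("")
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self.rows[-1][-1] += data


parser = TableAndLinkParser()
parser.feed(PAGE)

for row in parser.rows:
    print(row)
print("links to follow:", parser.links)
```

A recursive scraper would repeat the same steps for each entry in parser.links, ideally with a limit on depth and a pause between requests.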

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Intro to R and RStudio for Genomics
Unrestricted Use

Welcome to R! Working with a programming language (especially if it’s your first time) often feels intimidating, but the rewards outweigh any frustrations. An important secret of coding is that even experienced programmers find it difficult and frustrating at times – so if even the best feel that way, why let intimidation stop you? Given time and practice* you will soon find it easier and easier to accomplish what you want. Why learn to code? Bioinformatics – like biology – is messy. Different organisms, different systems, different conditions, all behave differently. Experiments at the bench require a variety of approaches – from tested protocols to trial-and-error. Bioinformatics is also an experimental science, otherwise we could use the same software and same parameters for every genome assembly. Learning to code opens up the full possibilities of computing, especially given that most bioinformatics tools exist only at the command line. Think of it this way: if you could only do molecular biology using a kit, you could probably accomplish a fair amount. However, if you don’t understand the biochemistry of the kit, how would you troubleshoot? How would you do experiments for which there are no kits? R is one of the most widely-used and powerful programming languages in bioinformatics. R especially shines where a variety of statistical tools are required (e.g. RNA-Seq, population genomics, etc.) and in the generation of publication-quality graphs and figures. Rather than get into an R vs. Python debate (both are useful), keep in mind that many of the concepts you will learn apply to Python and other programming languages. Finally, we won’t lie; R is not the easiest-to-learn programming language ever created. So, don’t get discouraged! The truth is that even with the modest amount of R we will cover today, you can start using some sophisticated R software packages, and have a general sense of how to interpret an R script. 
Get through these lessons, and you are on your way to being an accomplished R user! * We very intentionally used the word practice. One of the other “secrets” of programming is that you can only learn so much by reading about it. Do the exercises in class, re-do them on your own, and then work on your own problems.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Life Science
Measurement and Data
The Carpentries
La Terminal de Unix
Unrestricted Use

Software Carpentry lesson on the Unix shell. The Unix shell has been around longer than most of its users. It has survived so long because it is a powerful tool that allows people to do complex things with just a few keystrokes. More importantly, it helps them combine existing programs in new ways and automate repetitive tasks, instead of typing the same things over and over again. Use of the terminal, or shell, is fundamental to using many other powerful tools and computing resources (including supercomputers, or “high-performance computing”). This lesson will guide you on the path to using these resources effectively.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Library Carpentry: Introduction to Git
Unrestricted Use

Library Carpentry lesson: An introduction to Git.
What We Will Try to Do: Begin to understand and use Git/GitHub. You will not be an expert by the end of the class. You will probably not even feel very comfortable using Git. This is okay. We want to make a start but, as with any skill, using Git takes practice.
Be Excellent to Each Other: If you spot someone in the class who is struggling with something and you think you know how to help, please give them a hand. Try not to do the task for them: instead explain the steps they need to take and what these steps will achieve.
Be Patient With The Instructor and Yourself: This is a big group, with different levels of knowledge and different computer systems. This isn’t your instructor’s full-time job (though if someone wants to pay them to play with computers all day they’d probably accept). They will do their best to make this session useful. This is your session. If you feel we are going too fast, then please put up a pink sticky. We can decide as a group what to cover.

Applied Science
Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Plotting and Programming in Python
Unrestricted Use

This lesson, part of the Software Carpentry workshops, is an introduction to plotting and programming in Python for people with little or no previous programming experience. It uses plotting as its motivating example and is designed to be used in both Data Carpentry and Software Carpentry workshops. The lesson references JupyterLab but can also be taught using a regular Python interpreter. Please note that it uses Python 3 rather than Python 2.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Programming with MATLAB
Unrestricted Use

The best way to learn how to program is to do something useful, so this introduction to MATLAB is built around a common scientific task: data analysis. Our real goal isn’t to teach you MATLAB, but to teach you the basic concepts that all programming depends on. We use MATLAB in our lessons because: we have to use something for examples; it’s well-documented; it has a large (and growing) user base among scientists in academia and industry; and it has a large library of packages available for performing diverse tasks. But the two most important things are to use whatever language your colleagues are using, so that you can share your work with them easily, and to use that language well.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Programming with Python
Unrestricted Use

The best way to learn how to program is to do something useful, so this introduction to Python is built around a common scientific task: data analysis.
Arthritis Inflammation: We are studying inflammation in patients who have been given a new treatment for arthritis, and need to analyze the first dozen data sets of their daily inflammation. The data sets are stored in comma-separated values (CSV) format: each row holds information for a single patient, and the columns represent successive days. The first three rows of our first file look like this:
0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0
0,1,2,1,2,1,3,2,2,6,10,11,5,9,4,4,7,16,8,6,18,4,12,5,12,7,11,5,11,3,3,5,4,4,5,5,1,1,0,1
0,1,1,3,3,2,6,2,5,9,5,7,4,5,4,15,5,11,9,10,19,14,12,17,7,12,11,7,4,2,10,5,4,2,2,3,2,2,1,1
Each number represents the number of inflammation bouts that a particular patient experienced on a given day. For example, the value “6” at row 3, column 7 of the data set above means that the third patient experienced inflammation six times on the seventh day of the clinical study. So, we want to:
  • Calculate the average inflammation per day across all patients.
  • Plot the result to discuss and share with colleagues.
To do all that, we’ll have to learn a little bit about programming.
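A minimal sketch of the first goal, computing the average inflammation per day across patients, is shown here using only the three sample rows quoted above and Python's standard library. The variable names are illustrative, and the lesson itself may take a different route (for example, a numerical library and the full data files).

```python
import csv
import io
from statistics import mean

# The three sample rows quoted in the description: one row per patient,
# one column per day. A real analysis would read them from a CSV file.
SAMPLE = """\
0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0
0,1,2,1,2,1,3,2,2,6,10,11,5,9,4,4,7,16,8,6,18,4,12,5,12,7,11,5,11,3,3,5,4,4,5,5,1,1,0,1
0,1,1,3,3,2,6,2,5,9,5,7,4,5,4,15,5,11,9,10,19,14,12,17,7,12,11,7,4,2,10,5,4,2,2,3,2,2,1,1
"""

# Parse each CSV record into a list of integers (one list per patient).
rows = [[int(value) for value in record] for record in csv.reader(io.StringIO(SAMPLE))]

# zip(*rows) regroups the values by column, i.e. by day, so each mean
# is taken across all patients for that day.
daily_mean = [mean(day) for day in zip(*rows)]

for day_number, value in enumerate(daily_mean[:7], start=1):
    print(f"day {day_number}: {value:.2f}")
```

Plotting the result would then amount to passing daily_mean to a plotting library.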

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Programming with R
Unrestricted Use

The best way to learn how to program is to do something useful, so this introduction to R is built around a common scientific task: data analysis. Our real goal isn’t to teach you R, but to teach you the basic concepts that all programming depends on. We use R in our lessons because: we have to use something for examples; it’s free, well-documented, and runs almost everywhere; it has a large (and growing) user base among scientists; and it has a large library of external packages available for performing diverse tasks. But the two most important things are to use whatever language your colleagues are using, so you can share your work with them easily, and to use that language well.
We are studying inflammation in patients who have been given a new treatment for arthritis, and need to analyze the first dozen data sets of their daily inflammation. The data sets are stored in CSV format (comma-separated values): each row holds information for a single patient, and the columns represent successive days. The first few rows of our first file look like this:
0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0
0,1,2,1,2,1,3,2,2,6,10,11,5,9,4,4,7,16,8,6,18,4,12,5,12,7,11,5,11,3,3,5,4,4,5,5,1,1,0,1
0,1,1,3,3,2,6,2,5,9,5,7,4,5,4,15,5,11,9,10,19,14,12,17,7,12,11,7,4,2,10,5,4,2,2,3,2,2,1,1
0,0,2,0,4,2,2,1,6,7,10,7,9,13,8,8,15,10,10,7,17,4,4,7,6,15,6,4,9,11,3,5,6,3,3,4,2,3,2,1
0,1,1,3,3,1,3,5,2,4,4,7,6,5,3,10,8,10,6,17,9,14,9,7,13,9,12,6,7,7,9,6,3,2,2,4,2,0,1,1
We want to: load that data into memory, calculate the average inflammation per day across all patients, and plot the result. To do all that, we’ll have to learn a little bit about programming.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Project Organization and Management for Genomics
Unrestricted Use

Data Carpentry Genomics workshop lesson to learn how to structure your metadata, organize and document your genomics data and bioinformatics workflow, and access data on the NCBI Sequence Read Archive (SRA) database. Good data organization is the foundation of any research project. It not only sets you up well for an analysis, but it also makes it easier to come back to the project later and share with collaborators, including your most important collaborator - future you. Organizing a project that includes sequencing involves many components. There’s the experimental setup and conditions metadata, measurements of experimental parameters, sequencing preparation and sample information, the sequences themselves, and the files and workflow of any bioinformatics analysis. So much of the information of a sequencing project is digital, and we need to keep track of our digital records in the same way we have a lab notebook and sample freezer. In this lesson, we’ll go through the project organization and documentation that will make an efficient bioinformatics workflow possible. Not only will this make you a more effective bioinformatics researcher, it also prepares your data and project for publication, as grant agencies and publishers increasingly require this information. In this lesson, we’ll be using data from a study of experimental evolution using E. coli. More information about this dataset is available here. In this study there are several types of files:
  • Spreadsheet data from the experiment that tracks the strains and their phenotype over time
  • Spreadsheet data with information on the samples that were sequenced: the names of the samples, how they were prepared, and the sequencing conditions
  • The sequence data
Throughout the analysis, we’ll also generate files from the steps in the bioinformatics pipeline and documentation on the tools and parameters that we used.
In this lesson you will learn:
  • How to structure your metadata, tabular data and information about the experiment. The metadata is the information about the experiment and the samples you’re sequencing.
  • How to prepare for, understand, organize and store the sequencing data that comes back from the sequencing center
  • How to access and download publicly available data that may need to be used in your bioinformatics analysis
  • The concepts of organizing the files and documenting the workflow of your bioinformatics analysis

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Life Science
Measurement and Data
The Carpentries
Python for Humanities
Unrestricted Use

Python is a general-purpose programming language that is useful for writing scripts to work effectively and reproducibly with data. This is an introduction to Python designed for participants with no programming experience. These lessons can be taught in a day (about 6 hours). They start with some basic information about Python syntax and the Jupyter notebook interface, then move through how to import CSV files, how to use the pandas package to work with data frames, how to calculate summary information from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from Python.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
R for Reproducible Scientific Analysis
Unrestricted Use

This lesson is part of the Software Carpentry workshops: an introduction to R for non-programmers using Gapminder data. The goal of this lesson is to teach novice programmers to write modular code and to follow best practices for using R for data analysis. R is commonly used in many scientific disciplines for statistical analysis and its array of third-party packages. We find that many scientists who come to Software Carpentry workshops use R and want to learn more. The emphasis of these materials is to give attendees a strong foundation in the fundamentals of R, and to teach best practices for scientific computing: breaking down analyses into modular units, task automation, and encapsulation. Note that this workshop will focus on teaching the fundamentals of the programming language R, and will not teach statistical analysis. The lesson contains more material than can be taught in a day. The instructor notes page has some suggested lesson plans suitable for a one-day or half-day workshop. A variety of third-party packages are used throughout this workshop. These are not necessarily the best, nor are they comprehensive, but they are packages we find useful, and have been chosen primarily for their usability.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
R para Análisis Científicos Reproducibles
Unrestricted Use

An introduction to R using the Gapminder data. The goal of this lesson is to teach novice programmers to write modular code and adopt good practices in using R for data analysis. R provides a set of third-party packages that are commonly used across scientific disciplines for statistical analysis. We find that many scientists who attend Software Carpentry workshops use R and want to learn more. Our materials are relevant because they give attendees a solid foundation in the fundamentals of R and teach best practices for scientific computing: breaking analyses down into modules, task automation, and encapsulation. Note that this workshop focuses on the fundamentals of the R programming language and not on statistical analysis. A variety of third-party packages are used throughout this workshop; they are not necessarily the best, nor is all of their functionality explained, but they are packages we find useful, and they have been chosen primarily for their ease of use.

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries
Version Control with Git
Unrestricted Use

This lesson is part of the Software Carpentry workshops that teach how to use version control with Git. Wolfman and Dracula have been hired by Universal Missions (a space services spinoff from Euphoric State University) to investigate if it is possible to send their next planetary lander to Mars. They want to be able to work on the plans at the same time, but they have run into problems doing this in the past. If they take turns, each one will spend a lot of time waiting for the other to finish, but if they work on their own copies and email changes back and forth things will be lost, overwritten, or duplicated. A colleague suggests using version control to manage their work. Version control is better than mailing files back and forth:
  • Nothing that is committed to version control is ever lost, unless you work really, really hard at it. Since all old versions of files are saved, it’s always possible to go back in time to see exactly who wrote what on a particular day, or what version of a program was used to generate a particular set of results.
  • As we have this record of who made what changes when, we know who to ask if we have questions later on, and, if needed, revert to a previous version, much like the “undo” feature in an editor.
  • When several people collaborate in the same project, it’s possible to accidentally overlook or overwrite someone’s changes. The version control system automatically notifies users whenever there’s a conflict between one person’s work and another’s.
Teams are not the only ones to benefit from version control: lone researchers can benefit immensely. Keeping a record of what was changed, when, and why is extremely useful for all researchers if they ever need to come back to the project later on (e.g., a year later, when memory has faded). Version control is the lab notebook of the digital world: it’s what professionals use to keep track of what they’ve done and to collaborate with other people. Every large software development project relies on it, and most programmers use it for their small jobs as well. And it isn’t just for software: books, papers, small data sets, and anything that changes over time or needs to be shared can and should be stored in a version control system.
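The idea of keeping every old version, seeing who changed what, and reverting can be illustrated with a toy model in Python. This is only a sketch of the concept: it is not how Git stores data, and the file contents and the helper name changes are invented for the example.

```python
import difflib

# Toy history: each entry records who saved which version of the plan.
# The text is made up for illustration; Git itself works differently.
history = [
    {"author": "Wolfman", "text": "Target: Mars\nLander mass: unknown\n"},
    {"author": "Dracula", "text": "Target: Mars\nLander mass: 500 kg\n"},
]


def changes(old, new):
    """Return the removed (-) and added (+) lines between two saved versions."""
    diff = difflib.unified_diff(
        old["text"].splitlines(), new["text"].splitlines(), lineterm="", n=0
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Who changed what between the first and second version?
print(history[1]["author"], "changed:", changes(history[0], history[1]))

# "Reverting" in this toy model just means taking an earlier snapshot again.
restored = history[0]["text"]
```

Because no snapshot is ever overwritten, any earlier state can be inspected or restored, which is the property the description attributes to version control.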

Computer Science
Computer, Networking and Telecommunications Systems
Information Science
Measurement and Data
The Carpentries