We’ll take a look at it now with the UFOs dataset from Kaggle.

Using colnames() we can take a look at the existing column names.

We might want to add more clarity around the “comments” column, perhaps specifying that these aren’t metadata comments from the analyst, but an actual part of the dataset.

We will use the tbl_df() function to generate a tibble called tbl from hflights. (a:f selects all columns from a on the left to f on the right.) For example, FlightNum is changed to FlightNumber.

Use rename_all() to change the names of dataframe columns without any logical condition. For example, suppose you would like to change column names regardless of whether the columns are numeric: if a name contains Num, you want to replace that part with Number. After this operation, you can see that FlightNumber has changed to FlightNumberber (the name had already been changed from FlightNum, and it still contains Num) and TailNum has changed to TailNumber.

Along with dplyr rename(), you can also rename columns of a dataframe using a logical vector or an index. Let us now modify the column name “Month” of hflights to “month” using a logical vector. Another approach to renaming columns of a dataframe is to use the appropriate index on the names vector. Let us now modify the column name “Distance” to “distance”.

With dplyr, it’s super easy to rename columns within your dataframe. In this example, we’ll rename latitude and longitude to lat and long respectively. And there we have it!

Hence, it is better to use dplyr rename() instead of dplyr select() to modify column names.

This is similar to the code for renaming a single column that we saw above, except that we now use pairs of new and old column names. Let’s see the code for dplyr rename multiple columns in action.

Imagine that you want to rename hundreds of columns at once. Using dplyr rename() is not a good option in that scenario. This is where the three variants of dplyr rename(), namely rename_all(), rename_if() and rename_at(), come in handy. Use rename_at() to rename multiple columns at once.
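The three variants can be sketched side by side on a small tibble. This is only an illustration: the `flights` tibble below is a made-up stand-in for a few hflights columns, and the renaming functions (`tolower`, `gsub`) are chosen for the example rather than taken from the original walkthrough.

```r
library(dplyr)

# Hypothetical stand-in for a handful of hflights columns (not the real data)
flights <- tibble(
  FlightNum = c(428L, 429L),
  TailNum   = c("N576AA", "N557AA"),
  DepTime   = c(1400L, 1401L),
  ArrTime   = c(1500L, 1501L)
)

# rename_at(): apply a function only to the selected columns,
# here the ones whose names end with "Time"
by_at <- flights %>% rename_at(vars(ends_with("Time")), ~ tolower(.x))

# rename_if(): apply a function only to columns that pass a predicate,
# here the numeric ones (TailNum is character, so it is left alone)
by_if <- flights %>% rename_if(is.numeric, ~ gsub("Num", "Number", .x))

# rename_all(): apply a function to every column name;
# running it on by_if turns FlightNumber into FlightNumberber
by_all <- by_if %>% rename_all(~ gsub("Num", "Number", .x))

names(by_at)
names(by_if)
names(by_all)
```

Note that since dplyr 1.0.0 these scoped variants are marked as superseded in favour of rename_with(), for example `rename_with(flights, tolower, ends_with("Time"))`, although the older functions still work.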
We will now try to modify only those column names in tbl that end with the string “Time”. First, let us select those specific columns and save the result as tbl_times. Now tbl_times contains four columns: DepTime, ArrTime, ActualElapsedTime and AirTime.

Let’s try to change the DepTime column name to DepartureTime using dplyr rename(). Verify the column names after applying the dplyr rename() function.

Remember that unless you save the changes back to a variable, the changes made to a dataframe using dplyr operations do not take effect. So, if you want the renamed column to be applied to your tibble, you will need to save the result back to a variable again.

We can use dplyr select() to rename a dataframe column as well. The main difference is that dplyr select() keeps only the variables you specify, while dplyr rename() keeps all variables of the dataframe intact. What I mean is, if my dataframe has col1, col2, col3 and col4, and I change col1 to column1 using select(), then only column1 will be present in the resulting dataframe. If I use rename(), then column1, col2, col3 and col4 will all be present in the resulting dataframe.
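The difference is easy to check on a toy dataframe. The sketch below assumes a small made-up tibble `df` with the col1 to col4 columns from the description above; it also shows that the original is untouched until the result is assigned.

```r
library(dplyr)

# Toy tibble with the four columns from the description (values are made up)
df <- tibble(col1 = 1:2, col2 = 3:4, col3 = 5:6, col4 = 7:8)

# rename() changes the name and keeps every other column
renamed <- df %>% rename(column1 = col1)

# select() changes the name but keeps only the columns it was given
selected <- df %>% select(column1 = col1)

names(renamed)   # "column1" "col2" "col3" "col4"
names(selected)  # "column1"

# df itself is unchanged, because neither result was assigned back to it
names(df)        # "col1" "col2" "col3" "col4"
```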
I can rename this ‘assignee.login’ column before removing all the columns that start with ‘assignee’ together.

We can confirm that our change has been made by re-running colnames(). What if we wanted to rename more than one column in a single statement? Well, this is easily done too.

The first step before using rename() is to know what the existing column names are. This is done using colnames(). Let’s use dplyr rename() to modify column names in a dataframe or a tibble.

I have modified ArrTime to ArrivalTime, but tbl_times now contains ArrivalTime only!

As you can see, it’s super easy to rename columns with dplyr.

ufos <- ufos %>% rename(spotter.comments = comments)
ufos <- ufos %>% rename(lat = latitude, long = longitude)

This can be handy if you want to join two dataframes on a key, and it’s easier to just rename the column than to specify the key columns in the join. Alternatively, from a data munging perspective, sometimes you can have unhelpful column names like x1, x2, x3, so cleaning these up makes your dataframes and work more legible.
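As a rough sketch of the join case, assume two made-up tables whose key columns have different names. The table and column names here (`sightings`, `states`, `state_code`, `code`) are hypothetical and not part of the UFOs dataset.

```r
library(dplyr)

# Hypothetical tables: the key is called state_code in one and code in the other
sightings <- tibble(state_code = c("tx", "ca"), n = c(3L, 5L))
states    <- tibble(code = c("tx", "ca"), name = c("Texas", "California"))

# Rename the key so both tables share a column name, then join on it
states_renamed <- states %>% rename(state_code = code)
joined <- sightings %>% left_join(states_renamed, by = "state_code")

names(joined)  # "state_code" "n" "name"
```

Without the rename, the join would need the mapping spelled out, for example `by = c("state_code" = "code")`.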
dplyr provides the rename() function to rename columns, so let’s insert a rename step before the second select step.
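A minimal sketch of that idea follows, assuming a made-up `issues` tibble. Only the ‘assignee.login’ column name comes from the text; the other columns, the values and the new name `login` are illustrative assumptions.

```r
library(dplyr)

# Hypothetical issue data; only assignee.login is named in the text above
issues <- tibble(
  title          = c("Bug A", "Bug B"),
  assignee.login = c("alice", "bob"),
  assignee.id    = c(1L, 2L),
  assignee.url   = c("u1", "u2")
)

cleaned <- issues %>%
  # Rename first, so the column no longer matches the "assignee" prefix
  rename(login = assignee.login) %>%
  # Then drop every remaining column that starts with "assignee"
  select(-starts_with("assignee"))

names(cleaned)  # "title" "login"
```

The new name must not itself start with “assignee”, otherwise the select step would drop it along with the rest.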