
Loop Killers: Python Zips and Comprehensions by Example

Your Python Code Arrows to Dispatch Jurassic Loop Holdouts

Heiko Onnen

Oct 20, 2021·16 min read

image by Stefan Keller, Fantasy Dino Dinosaur Free photo on Pixabay
image by author

A tutorial by example:

  • List, set, and dictionary comprehensions, with conditions or filter-like behavior;
  • the zipping and unzipping of lists, dictionaries, and tuples in conjunction with comprehensions;
  • followed by some speed measurements that will compare the performance of old-style loops with comprehensions.

0. The Pythonic Character of Comprehensions

If, like me, you came to Python from other programming languages that do not offer similar constructs, you've probably been puzzled when you first encountered list comprehensions, and amazed once you grasped how they can make your code more concise and faster. Comprehensions, like loops, serve to filter lists or dictionaries, extract items, and transform values. Let's walk through a sequence of examples for both loops and comprehensions.

1. Zips and Comprehensions: Dealing with Lists, Tuples, and Dictionaries

1.1 Zipping

1.1.a Zipping Lists of Equal Length

Suppose we have to process an unstructured group of values we have received from separate methods in a Python script. In our case, we assume that we need to deal with a list of three prediction accuracy metrics: RMSE, MAPE, and R-squared. We want to avoid passing each of these variables individually to other methods. Rather, the three metrics should be processed in tandem. Therefore, we will collect them in a list or dictionary and then demonstrate how we can operate with them and on them.

We combine the numerical values in a list, acc_values.

To distinguish the metrics, we write down their names in a second list, acc_names, taking care that it matches the sort order of the first list.

image by author

Lines 23 and 27 are alternative formulations; they generate the same list of names.

Stealthily, we've introduced an operation on lists without using a keyword in line 27. Note the asterisk * in front of the list variable, and the comma behind it. The syntax

  • *listname, = list of comma-separated values

generates a list. The asterisk * is usually known as the unpacking operator, which we will use further below to unzip lists of tuples. But it can also collect items into a list when we apply the syntax of line 27.
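The two formulations can be sketched as follows. The original lines 23 and 27 are not reproduced here, so this is an assumed reconstruction that only borrows the metric names from the text.

```python
# Conventional list literal
acc_names = ["RMSE", "MAPE", "R-squared"]

# Starred assignment: the trailing comma turns the left-hand side into a
# tuple of targets, and the star collects all right-hand values in a list
*acc_names_star, = "RMSE", "MAPE", "R-squared"

print(acc_names_star)
print(acc_names_star == acc_names)
```

Without the trailing comma, Python rejects the starred target with a SyntaxError, which is why the comma matters.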

Next, we combine the two lists by zipping them. Syntax:

  • mylist = list(zip(list1, list2, ...))
image by author

Python's zip function pairs the corresponding items of two (or more) lists into tuples; wrapped in list(), the result is a single list of tuples, without the need for a multi-line loop.
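A minimal sketch of the zipping step. The list names acc_names and acc_values come from the text; the numerical values are made up for illustration.

```python
acc_names = ["RMSE", "MAPE", "R-squared"]
acc_values = [12.34, 0.056, 0.91]  # hypothetical metric values

# zip() pairs items by position; list() materializes the pairs
acc_pairs = list(zip(acc_names, acc_values))

print(acc_pairs)
# [('RMSE', 12.34), ('MAPE', 0.056), ('R-squared', 0.91)]
```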

1.1.b Unzipping a List of Tuples

What about the opposite case? Can we, if we are confronted with a list of tuples, unzip them to obtain separate lists without needing to write a loop? Yes, we precede the list variable with an asterisk or star * to unpack the tuples. Syntax:

  • var1, ..., varN = zip(*mylist)
image by author
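Unzipping can be sketched like this, again with made-up values. Note that zip returns tuples, so a list() call is needed if lists are required.

```python
acc_pairs = [("RMSE", 12.34), ("MAPE", 0.056), ("R-squared", 0.91)]

# The star passes each tuple as a separate argument to zip(),
# which then regroups the first items and the second items
names, values = zip(*acc_pairs)

print(names)         # ('RMSE', 'MAPE', 'R-squared')
print(list(values))  # [12.34, 0.056, 0.91]
```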

1.1.c Zipping Lists of Unequal Length, with Omissions

If we deal with multiple lists that differ in their item counts, we can still combine them into a list of tuples by applying the list() constructor to zip. However, the zip function stops at the end of the shortest list and omits the excess elements of the longer lists.

Syntax:

  • mylist = list(zip(list1, ..., listN))
image by author
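A short sketch of the truncation, assuming a fourth metric name, MSE, for which no value has been supplied. The values are hypothetical.

```python
acc_names = ["RMSE", "MAPE", "R-squared", "MSE"]
acc_values = [12.34, 0.056, 0.91]  # one value short

# zip() stops at the shortest input, so "MSE" is dropped silently
truncated = list(zip(acc_names, acc_values))

print(truncated)
# [('RMSE', 12.34), ('MAPE', 0.056), ('R-squared', 0.91)]
```

From Python 3.10 on, zip(..., strict=True) raises a ValueError instead of truncating silently, which can help to catch mismatched lists early.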

1.1.d Zipping Lists of Unequal Length, with Padding

Skipping the excess items which the longer lists contain will not be the preferred behavior in most cases. Luckily, the itertools library comes to the rescue and provides the zip_longest function. It inserts all the items of the longest list into the tuples and, by default, writes None into those tuple items to which the shorter lists cannot contribute values. Here, for instance, the preceding methods do not provide a value for the fourth metric, MSE; therefore, zip_longest inserts a value of None to complete the last tuple.

Note again our mantra of conciseness: where possible, we want to create a list in a single line of code, like the one in row 8 below; hence, without a loop that would require several lines.

Syntax:

  • mylist = list(zip_longest(list1, ..., listN))
image by author
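The padding behavior can be sketched as follows, with hypothetical values. The fillvalue argument in the second call is an optional extra that the text does not discuss.

```python
from itertools import zip_longest

acc_names = ["RMSE", "MAPE", "R-squared", "MSE"]
acc_values = [12.34, 0.056, 0.91]  # no value for MSE

# Missing items are padded with None by default
padded = list(zip_longest(acc_names, acc_values))
print(padded[-1])  # ('MSE', None)

# A different placeholder can be chosen via fillvalue
padded_na = list(zip_longest(acc_names, acc_values, fillvalue="n/a"))
print(padded_na[-1])  # ('MSE', 'n/a')
```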

1.2a Zipping a List of Tuples to Create a Dictionary

Let's convert the list of tuples into a dictionary to get rid of the many brackets. Syntax:

  • mydict = dict(mylist)
image by author
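As a minimal sketch with made-up values: dict() accepts any iterable of two-item tuples and treats the first item of each pair as the key.

```python
acc_pairs = [("RMSE", 12.34), ("MAPE", 0.056), ("R-squared", 0.91)]

# Each (name, value) tuple becomes one key/value entry
acc_dict = dict(acc_pairs)

print(acc_dict)
# {'RMSE': 12.34, 'MAPE': 0.056, 'R-squared': 0.91}
```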

1.2b Zipping Lists to Create a Dictionary

To demonstrate how the zip functions can be applied to lists, we've made a detour and combined the original list of names and the list of values in a single list of tuples before we converted that one to a dictionary.

Now we skip a redundant step: we demonstrate how the two original lists can directly be converted to a dictionary without a detour through tuples: again by using the zip function. Row 3 represents a dictionary comprehension.

Syntax:

  • mydict = {key: val for key, val in zip(listK, listV)}
image by author

The dictionary removed the many brackets of the tuples.
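The direct route from two lists to a dictionary might look like this, with hypothetical values. The dict(zip(...)) shortcut in the last lines is an equivalent alternative that the text does not mention.

```python
acc_names = ["RMSE", "MAPE", "R-squared"]
acc_values = [12.34, 0.056, 0.91]  # hypothetical metric values

# Dictionary comprehension over the zipped pairs
acc_dict = {key: val for key, val in zip(acc_names, acc_values)}

# Equivalent shortcut without a comprehension
acc_dict_short = dict(zip(acc_names, acc_values))

print(acc_dict)
```

The comprehension form pays off once the keys or values need to be transformed or filtered on the way in; for a plain pairing, the shortcut is enough.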

But how can we generate a tabular layout that will be better readable in a report?

1.3 Unpacking a Dictionary

Before we continue with our zip-generated dictionary: how do we implement the opposite operation and unpack an existing dictionary, preferably without loops?

We can assign the dictionary values to comma-separated variables on the left-hand side of code line 3. Or, instead of values, assign pairs consisting of both key and value.

Syntax:

  • var1, ..., varN = mydict.items()
image by author
  • var1, ..., varN = mydict.values()
image by author

Or we unpack the dictionary keys and values to two separate lists, by using the .keys() and .values() methods and converting their results with the list() constructor. Syntax:

  • listKeys = list(mydict.keys())
  • listValues = list(mydict.values())
image by author
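The unpacking variants above can be sketched as follows. The dictionary contents and the variable names on the left-hand side are made up for illustration.

```python
acc_dict = {"RMSE": 12.34, "MAPE": 0.056, "R-squared": 0.91}

# Unpack the values into separate variables
rmse, mape, r2 = acc_dict.values()

# Unpack (key, value) pairs instead of bare values
item1, item2, item3 = acc_dict.items()
print(item1)  # ('RMSE', 12.34)

# Keys and values as two separate lists
list_keys = list(acc_dict.keys())
list_values = list(acc_dict.values())
print(list_keys, list_values)
```

The number of variables on the left must match the number of dictionary entries; otherwise Python raises a ValueError. Dictionaries preserve insertion order since Python 3.7, so the variables line up with the order in which the entries were added.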

Or, third variant, we pairwise unpack the keys and values to a list of tuples, by using both the zip and list functions in tandem.

A shorter way to the list of tuples:

image by author

Syntax:

  • listTuples = list(mydict.items())
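Both routes to the list of tuples, sketched with hypothetical values. The two results are identical because keys() and values() iterate in the same order.

```python
acc_dict = {"RMSE": 12.34, "MAPE": 0.056, "R-squared": 0.91}

# Third variant: zip the keys and the values in tandem
list_tuples_zip = list(zip(acc_dict.keys(), acc_dict.values()))

# Shorter way: items() already yields (key, value) pairs
list_tuples = list(acc_dict.items())

print(list_tuples)
```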

1.4 The Structure of List and Dictionary Comprehensions

The general structure of comprehensions follows this pattern:

  • an expression at the beginning; in the following examples: the print function that takes the key and the value of a dictionary item and prints the pair;
  • next, the for ... in reference to the items (key and value) which the comprehension will extract from the dictionary (in list comprehensions, the key is not applicable); the speed tests we will run further below will demonstrate that the for ... in construct within comprehensions is faster than the traditional for loops;
  • followed by the dictionary or list itself which contains all the keys and/or values which the expression is supposed to process;
  • optionally, a condition (if) can be appended to the comprehension's syntax to filter the dictionary or list, while an if ... else expression at the beginning chooses between alternative expressions; we will see examples further below.
  • Syntax: [expression for item in list]
  • example: mylist = [x**2 for x in numberslist]
  • if numberslist = [1,2,3], then mylist = [1,4,9]
  • Syntax: {key: expression for key, value in dictionary.items()} or {key: expression for item in list}
  • example: mydict = {v: v**2 for v in numberslist}
  • The expression squares the values in the iterable, which is a list of numbers in the example, and then pairs the input argument (which it will interpret as the key) with its squared value.
  • If numberslist = [1,2,3], then mydict = {1:1, 2:4, 3:9}
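The two examples from the list above, written out as runnable code. The prices dictionary in the last part is a hypothetical addition that shows the key, value form over dictionary.items().

```python
numberslist = [1, 2, 3]

# List comprehension: [expression for item in list]
mylist = [x**2 for x in numberslist]
print(mylist)  # [1, 4, 9]

# Dictionary comprehension over a list: the item serves as the key
mydict = {v: v**2 for v in numberslist}
print(mydict)  # {1: 1, 2: 4, 3: 9}

# Dictionary comprehension over an existing dictionary (hypothetical data)
prices = {"apple": 1.0, "pear": 2.5}
doubled = {key: value * 2 for key, value in prices.items()}
print(doubled)  # {'apple': 2.0, 'pear': 5.0}
```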

1.5 Examples: Comprehensions Used for Printing

To see practical examples, let's create some dictionary and list comprehensions that will pretty-print multiple results. We want to obtain a report-ready output in a tabular, vertical layout:

image by author

Python offers the pretty-print library with its function pprint(). But it does not return the layout we'd prefer, the one shown below:

image by author

We could use a for-loop to print the dictionary items one by one. In this case, the loop shown above is even quite concise. But the speed tests further below will demonstrate that loops are significantly slower, whether we use them for printing or other methods. The time inefficiency would be felt when we need to cope with lists or dictionaries that contain hundreds or thousands of items.

We could convert the dictionary to a dataframe. The one-liner is a neat alternative to comprehensions.

image by author

Though if we need to deal with a much larger dictionary that contains 10,000 strings as its values and their index numbers as its keys, a conversion to a dataframe turns out to be a whopping 175 times slower than a list comprehension.

As an alternative to long-winded loops and slower dataframe conversions, let's try out a list comprehension on our dictionary.

We enclose our dictionary of three prediction accuracy metrics in a list comprehension, which I nickname a print comprehension whenever I use it for pretty-printing. The print() function represents the expression at the start of any comprehension.

image by author

The list comprehension prints the contents of the dictionary, one below the other.

The only flaw: the list of three None values. Where do they originate? This is not a bug in the list comprehension itself. Rather, the print function always returns None, once for each of the key/value pairs it processes, and the comprehension collects these return values in a list. A Jupyter notebook then echoes that list because it is the value of the last expression in the cell. We are going to suppress the None values.

First approach: add another line of code beneath the print comprehension, so that the comprehension is no longer the last expression in the cell.

image by author

Second approach: the additional line can also consist of a pass statement.

image by author

Third approach: assign the print comprehension to any variable.

image by author

Fourth approach: assigning it to an underscore serves the same purpose.

image by author

Finally, let's complete our print comprehension exercise and pretty-print the dictionary of named metrics and their numerical values (v). The names of the metrics serve as the keys (k). We enclose the value variable v in a number format inside the print function.

image by author
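A sketch of the print comprehension and of the underscore approach. The metric values and the column widths in the format specification are assumptions, since the original code is not reproduced here.

```python
acc_dict = {"RMSE": 12.34, "MAPE": 0.056, "R-squared": 0.91}  # made-up values

# The comprehension prints each pair and collects print()'s return values
returned = [print(k, v) for k, v in acc_dict.items()]
print(returned)  # [None, None, None]

# Assigning to an underscore discards the list of None values;
# the f-string left-aligns the key and formats the value to 3 decimals
_ = [print(f"{k:<10} {v:>8.3f}") for k, v in acc_dict.items()]
```

The list of None values only shows up when an interactive environment echoes the last expression; a plain script never displays it.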

For comparison, let's have a look at an alternative code construction I saw on several websites. To print a dictionary, the websites proposed to define a function that runs through a for-loop. It does the job, but I'd hesitate to call it Pythonic elegance.

By now, we are sensitive enough to loops that their occurrence in a Python script will give us some pause. Comprehensions cannot replace loops in all circumstances. But a task as simple as printing a dictionary should require no more than one line of code. Defining a function to wrap print(), which is itself a function, appears redundant. If not the function, then the loop in the function body should raise our hackles and motivate a glance at possible alternatives that need fewer lines.

1.6 Summary: Zip to Dict and Print

It's time to summarize the steps we've taken and omit the alternative and intermediate solutions we've discussed in order to demonstrate the various ways in which zip functions and comprehensions can interact with one another. We are going to condense the script to four lines of code.

We wanted to deal with a group of prediction accuracy metrics and their values, which preceding calculations have generated and passed to our script.

  • We combined their separate values in a list, acc_values.
  • To correctly label the values, we created a list of names for the metrics, acc_names.
  • In row 13, zipping generates a list of tuples which the dictionary comprehension turns into a dictionary,
  • which we print in row 16 via a list comprehension.
image by author
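The condensed script could look roughly like this. It is an assumed reconstruction with made-up values, not the original four lines.

```python
acc_values = [12.34, 0.056, 0.91]               # hypothetical metric values
*acc_names, = "RMSE", "MAPE", "R-squared"       # names in matching order
acc_dict = {k: v for k, v in zip(acc_names, acc_values)}        # zip to dict
_ = [print(f"{k:<10} {v:>8.3f}") for k, v in acc_dict.items()]  # print
```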

Theoretically, we could merge rows 13 and 16 in a single row that creates the dictionary and immediately prints it.

But readability should be prioritized over reducing the number of code lines. A single line that is just twice as long does not count as an improvement. The creation of the dictionary and its print-out represent separate purposes, therefore each of them deserves its own line, as in rows 13 and 16.

2. Performance: Speed of Comprehensions and Loops

Chapter 1 has focused on the length of the code that comes with loops, in contrast to the less verbose single-liners that most comprehensions require. Comprehensions, in general, are more concise.

In this chapter, let's test whether the shorter code of comprehensions also results in faster execution.

To generate data on which a loop and a comprehension can compete with one another, we create a list we fill with random numbers. We use a short list comprehension in line 3.

The comprehension c that will serve our purpose consists of a short line of code in row 15 while the loop, as usual, needs multiple lines to process all elements of the list.

image by author

2.1 A List Comprehension to Operate on Numerical Values

After preparing the race track as shown above, we want to look for measurable speed differences. Therefore, using the same code as above, we create a much longer list of 100,000 random numbers and then unleash the loop and the comprehension on it.

image by author

The list comprehension is 21% faster than the loop.

Note that the measured times will differ whenever you rerun the code. The effective time depends on how your computer allocates processor capacity between the tasks that are waiting in line at any given moment. But the outcomes can be expected to demonstrate that comprehensions are faster than loops, with few exceptions.
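One way to set up such a race is sketched below. The function names, the seed, and the use of the timeit module are assumptions; the original timing code may well differ. Taking the minimum of several repeats reduces the noise from concurrent tasks.

```python
import random
import timeit

random.seed(42)  # arbitrary seed, for reproducible data only
numbers = [random.randint(1, 100) for _ in range(100_000)]

def square_with_loop(values):
    # Traditional loop: append each squared value to a result list
    result = []
    for x in values:
        result.append(x**2)
    return result

def square_with_comprehension(values):
    # Same operation as a single list comprehension
    return [x**2 for x in values]

# Run each variant 5 times per measurement and keep the best of 3 measurements
loop_time = min(timeit.repeat(lambda: square_with_loop(numbers), number=5, repeat=3))
comp_time = min(timeit.repeat(lambda: square_with_comprehension(numbers), number=5, repeat=3))

print(f"loop: {loop_time:.4f} s, comprehension: {comp_time:.4f} s")
```

The absolute numbers and the percentage gap depend on the machine and the Python version, so they should be read as indicative only.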

2.2 Conditional Expressions in List Comprehensions

Let's add another layer to the loop and the comprehension by subjecting them to a conditional expression they are to evaluate before applying the expression to the list items.

Any list value in our example should only be squared if it is larger than 90. This construction filters the source list by only passing those values which meet the condition to the list of results, after applying the expression to the selected values.

Syntax of the conditional comprehension:

  • mylist = [expression for item in list if (condition==True)]

A condition can also contain an else clause that chooses between two different expressions. Where there was a single expression at the beginning of the comprehension in the previous example, there are now two alternative expressions. But only one of them will be executed on each of the values the comprehension reads from the source list. We can again append a condition at the end, which has the effect of a filter, as in the previous example, and would thus limit the evaluations of the condition #2 to the items that meet condition #1. In other words, condition #1 selects the source data that will be passed to the two expressions at the beginning of the line; then condition #2 determines which of the two expressions, either A or B, will transform the selected source value.

Syntax:

  • mylist = [expressionA if (condition2==True) else expressionB for item in list if (condition1==True)]

Example: square an argument if it exceeds 90, but raise it to the third power if it is smaller than or equal to 90; return all the exponentiated results, whether they have been squared or cubed, as a list mylist, but only if the source value is an even number, as per the condition at the end of the line. Thus, odd numbers are omitted from the exponentiations and from the list of results:

  • mylist = [(x**2) if (x>90) else (x**3) for x in list if (x%2==0)]
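Both forms can be tried on a small list. The five input numbers are made up so that the results are easy to check by hand.

```python
numbers = [88, 89, 92, 95, 100]  # hypothetical sample values

# Filter only: square the values greater than 90, drop the rest
squared_large = [x**2 for x in numbers if x > 90]
print(squared_large)  # [8464, 9025, 10000]

# Filter plus if/else: keep even numbers only (condition 1), then
# square those above 90 and cube the others (condition 2)
mixed = [x**2 if x > 90 else x**3 for x in numbers if x % 2 == 0]
print(mixed)  # [681472, 8464, 10000]
```

The position decides the role: an if at the end filters items out, whereas an if ... else at the beginning keeps every item that passed the filter and merely chooses the expression.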

The following example demonstrates the syntax with a single, filter-like condition #1 at the end. The comprehension only processes numbers greater than 90 and omits smaller arguments from the squared results, which therefore will form a shorter list than the source list of 100,000 small and large random numbers.

image by author

The comprehension races through the list at a 45% higher speed than the loop.

We test another variant: we formulate a loop and a list comprehension that both contain a conditional expression, which squares a value only if it is an even number and skips the odd-numbered arguments.

image by author

The conditional list comprehension is 18.8% faster than the loop. That's close to the low end. At times, you will see processing times that are 40 to 50% faster, depending on the workload prioritizations of your processors for concurrent tasks.

2.3 A List Comprehension to Operate on Strings

We have investigated the performance of list comprehensions for numerical variables. Let's check how loops and comprehensions fare with lists of strings such as names, words, or date strings.

To quickly obtain a long list of strings we want to use for search and filter exercises, we create a date range that contains the names of 10,000 weekdays and then convert the elements from datetime to string type.

We execute the conversion by means of a list comprehension (not via a loop, of course, whenever we can avoid one). The expression element in this comprehension consists of the strftime function.

image by author
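A sketch of the conversion that relies on the standard library only. The start date, the format string, and the use of consecutive calendar days are assumptions; the original may have used a different date-range helper and format.

```python
from datetime import date, timedelta

start = date(2021, 1, 1)  # hypothetical start date

# strftime is the expression; the comprehension applies it to 10,000
# consecutive days, e.g. abbreviated day name, day, month, year
date_strings = [
    (start + timedelta(days=i)).strftime("%a %d %b %Y")
    for i in range(10_000)
]

print(date_strings[:3])
```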

2.4 A Conditional List Comprehension To Operate on Strings

Next, we formulate a conditional expression to sieve through the list of date strings. Let's search for Saturdays and Sundays in the month of October of all years. The loop and the comprehension are supposed to modify the date strings they find by appending the suffix "= Oct weekend" to the dates.

We evaluate three conditions: the name of the day is (S)unday or (Sat)urday and the name of the month begins with Oc. These conditions determine whether or not a value will be subjected to the expression (here, by adding the suffix "= Oct weekend"). The else term tells the comprehension to include the other date strings in the list of results without modifying them.

We could also use an additional condition to filter the values that will be subjected to the expression. For instance, only values in the year 2022 are to be evaluated. As in the example of the random numbers above, we would append such a filter condition (condition1) to the end of the list comprehension. Syntax:

  • mylist = [expressionA if (condition2==True) else expressionB for item in list if (condition1==True)]
image by author

The list comprehension, using the same conditional expression as the loop, is 21.9% faster.
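The conditional comprehension on date strings might be sketched as follows. The date format, the 400-day range, and the helper is_oct_weekend are assumptions made for illustration; the original conditions may be spelled differently.

```python
from datetime import date, timedelta

start = date(2021, 10, 1)  # hypothetical start date
date_strings = [
    (start + timedelta(days=i)).strftime("%a %d %b %Y")
    for i in range(400)
]

def is_oct_weekend(s):
    # Strings look like "Sat 02 Oct 2021": the day name comes first,
    # and the month abbreviation starts at index 7
    return s.startswith(("Sat", "Sun")) and s[7:9] == "Oc"

# if/else at the front: tag October weekends, pass other dates unchanged
tagged = [s + " = Oct weekend" if is_oct_weekend(s) else s for s in date_strings]

# Additional filter at the end: only evaluate dates in 2022
tagged_2022 = [
    s + " = Oct weekend" if is_oct_weekend(s) else s
    for s in date_strings
    if s.endswith("2022")
]

print(sum(s.endswith("= Oct weekend") for s in tagged))
```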

2.5 Set Comprehensions

After having demonstrated how list and dictionary comprehensions outperform loops, let's look at an example of set comprehensions.

We want to find the prime numbers among the first N = 100,000 integers.

We will use an improved variant of the sieve of Eratosthenes (Sieve of Eratosthenes, Wikipedia). First, the sieve algorithm creates a set of all the integers from 2 to 100,000 (by modern definition, 1 is not a prime). Then it paces through all the integers i up to the square root of N and discards from the set those multiples j of i which are equal to or larger than the square of i. The sieve is by far not the fastest algorithm to find primes. But the mathematics aside, we can apply the logic of the sieve both in a loop and a set comprehension to compare their processing speeds.

The traditional style would use a nested loop: for i as the outer loop and for j as the inner loop. Let's create a nested comprehension to act as its challenger.

Syntax of nested comprehensions:

  • myset = {expression(itemA, itemB) for itemB in setB for itemA in setA}
image by author

The set comprehension is 57% faster than the loop.
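A minimal sketch of the sieve logic in both styles. The variable names are assumptions, and the original implementation may be structured differently; this version collects the composite numbers in a nested set comprehension and subtracts them from the candidates.

```python
import math

N = 100_000
limit = math.isqrt(N)  # integer square root, available from Python 3.8

# Nested loop: start with all candidates and discard the multiples of i
primes_loop = set(range(2, N + 1))
for i in range(2, limit + 1):
    for j in range(i * i, N + 1, i):
        primes_loop.discard(j)

# Nested set comprehension: the outer "for i" comes first, the inner "for j"
# second; every j is a multiple of i that is at least i squared
composites = {j for i in range(2, limit + 1) for j in range(i * i, N + 1, i)}
primes_comp = set(range(2, N + 1)) - composites

print(len(primes_comp), primes_loop == primes_comp)
```

Both variants do some redundant work because they also step through the multiples of composite values of i; that keeps the two versions comparable, at the cost of efficiency.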

2.6 Are Comprehensions the Panacea for All Iterables?

So are we done yet? Have comprehensions been proven unbeatable under any circumstances? Well, not quite.

Let's filter the list of date strings again, but this time by using a filter based on a lambda function.

The lambda-based filter takes 0.0001 seconds. The list comprehension, at 0.0033 seconds, is 22 times slower.

image by author
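When reproducing this comparison, one detail deserves attention. The original timing code is not shown here, so whether it applies is an assumption: in Python 3, filter() returns a lazy iterator and performs no filtering until the iterator is consumed, so timing the bare filter() call measures almost nothing. A fair race wraps it in list(). The date format and the Saturday condition below are hypothetical.

```python
from datetime import date, timedelta

start = date(2021, 1, 1)  # hypothetical start date
date_strings = [
    (start + timedelta(days=i)).strftime("%a %d %b %Y")
    for i in range(10_000)
]

# Creating the filter object is nearly instantaneous: nothing is evaluated yet
lazy = filter(lambda s: s.startswith("Sat"), date_strings)
print(type(lazy).__name__)  # filter

# The actual work happens only when the iterator is consumed
saturdays_filter = list(lazy)

# The list comprehension does the same work eagerly
saturdays_comp = [s for s in date_strings if s.startswith("Sat")]

print(saturdays_filter == saturdays_comp)  # True
```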

We will reserve horse races between list comprehensions and lambda-filters as a separate topic for another article.

3. Conclusions

Today's article has focused on the comparison of traditional loops with list, set, and dictionary comprehensions; and on the zip, unzip, and unpack methods associated with comprehensions.

The examples demonstrated

  • how comprehensions and zip functions interact with one another on lists, tuples, and dictionaries;
  • how their interactions can be coded concisely, often expressible in a single line, whereas loops would extend over multiple lines; each of the examples showed an operation that in many other languages would typically require a loop;
  • and that we can expect comprehensions to offer superior speed over traditional loops.

The Jupyter notebook is available for download on GitHub: h3ik0th/LoopKillers (Python's zip and comprehensions against loops)
