In this tutorial, you'll take a look at PyQuery, a Python library that lets you run jQuery-style queries on XML and HTML documents. Its syntax closely mirrors jQuery's, so if you are familiar with jQuery, the examples should be easy to follow.
Getting Started With PyQuery
To get started with PyQuery, install the Python package using pip.
pip install pyquery
Once you have installed PyQuery, import it into your Python program.
from pyquery import PyQuery as pq
Let's start with a basic example of how it works. Consider the following HTML:
<div id="divMain"> <span> Hello World </span> </div>
Pass the input markup to PyQuery (imported here as pq), and you can apply jQuery-style queries to the resulting object.
divMain = pq("<div id='divMain'><span>Hello World</span></div>")
Treat divMain like a jQuery object and print the HTML content of the div.
print(divMain('#divMain').html())
Save the changes and try running the program. You should have the following output:
<span>Hello World</span>
To access the Hello World text inside the span, the code would be:
print(divMain('#divMain').text())
Save the changes and try running the code. You should have the following output:
Hello World
Attributes Manipulation Using PyQuery
Now let's have a look at how to read attributes using the PyQuery library. Assume that you have an HTML element as shown:
<ul id="ulMain"> <li>Roy</li> <li>Hari</li> </ul>
Use the PyQuery library to read the above HTML content.
ulMain = pq("<ul id='ulMain'><li>Roy</li><li>Hari</li></ul>")
Let's try to access the id attribute of the ulMain ul.
print(ulMain.attr('id'))
Save the above changes and try running the Python program. You should see the ID ulMain printed on the terminal.
You saw how to access the id attribute of the ulMain ul. Now let's set a class name on the same element. To set an attribute, pass the attribute name and its value to attr, as shown:
ulMain.attr('class','testclass')
Print the ulMain object and you should see the element with the class attribute.
<ul id="ulMain" class="testclass"> <li>Roy</li> <li>Hari</li> </ul>
You can also add and remove classes directly without using the attr method. To add a class to the element, use the addClass method.
ulMain.addClass('test')
To remove an existing class from the element, use the removeClass method.
ulMain.removeClass('test')
Handling CSS Using PyQuery
Apart from attributes, an element can also have CSS properties. To add or modify them, use the css method and specify the properties. Let's say that you want to set the height in the style of the ulMain ul. The following code adds the required style property to the element.
ulMain.css('height','100px')
Save the above changes and run the Python program. When you print ulMain, the ul should include the new style.
<ul id="ulMain" style="height: 100px"> <li>Roy</li> <li>Hari</li> </ul>
To add multiple styles to the element, you can specify them as shown:
ulMain.css({'height':'100px','width':'100px'})
Run the program and you should see the styles added to ulMain.
<ul id="ulMain" style="width: 100px; height: 100px"> <li>Roy</li> <li>Hari</li> </ul>
Creating & Appending Elements
When building markup dynamically, you create new elements and append them to an existing parent element, where they'll be rendered. Let's have a look at how to create and append elements using PyQuery.
Assume you have a main container div called divMain.
divMain = pq("<div id='divMain'></div>")
Let's create a new span element using PyQuery.
span = pq('<span>Hello World</span>')
Add some CSS properties to the span.
span.css({'color':'blue','font-weight':'bold'})
PyQuery provides a method to add elements to existing elements. You can use the append method to append the span element to the div divMain. Here is the code:
divMain.append(span)
print(divMain)
Save the changes and run the program. You should see divMain printed with the newly created span appended to it.
<div id="divMain"> <span style="color: blue; font-weight: bold"> Hello World </span> </div>
You used the append method to attach the span to the div. An alternative is appendTo, which is called on the element being added and takes the target parent as its argument. In the above case, you can use it like so:
span.appendTo(divMain)
Finding Elements Using PyQuery
PyQuery provides methods to find children, the next elements, the closest elements, etc. It also provides methods to filter and find elements inside a particular node.
Assume that you have a particular piece of HTML as shown:
<div id="divMain"> <div id="content"> <ul> <li>Jade</li> <li>Jim</li> </ul> </div> <div id="list"> <span>Hello World</span> </div> </div>
Pass this HTML to PyQuery to create the object:
divMain = pq(
    "<div id='divMain'>"
    "<div id='content'>"
    "<ul>"
    "<li>Jade</li>"
    "<li>Jim</li>"
    "</ul>"
    "</div>"
    "<div id='list'>"
    "<span>Hello World</span>"
    "</div>"
    "</div>"
)
Let's use PyQuery to find the children of the div divMain. Add the following line of code to print them.
print(divMain.children())
On running the program, you should have the following output:
<div id="content"> <ul> <li>Jade</li> <li>Jim</li> </ul> </div> <div id="list"><span>Hello World</span></div>
To find the nearest matching element, starting from the element itself and moving up through its ancestors, use the closest method. For example, to find the closest div element to the span, the code would be:
print(divMain('span').closest('div'))
The above command would return the following output:
<div id="list"><span>Hello World</span></div>
A find method is also available to search for descendant elements. For example, to find a span inside divMain, call the method as shown:
print(divMain.find('span'))
The above command would return the span.
<span>Hello World</span>
Insert an Element Inside HTML
While append adds elements to the end of an existing element, sometimes you need to insert elements before or after specific elements. PyQuery provides methods for both cases.
Let's define a new paragraph element to be inserted after the span inside the divMain div.
p = pq('<p>Welcome</p>')
Call the insertAfter method on the paragraph element p to insert it after the span.
p.insertAfter(divMain('span'))
print(divMain)
Save the above changes and run the Python program. You should have the following output:
<div id="divMain"> <div id="content"> <ul> <li>Jade</li> <li>Jim</li> </ul> </div> <div id="list"> <span>Hello World</span> <p>Welcome</p> </div> </div>
Similarly, the insertBefore method inserts the element before the target. Modify the code as shown below to use insertBefore:
p.insertBefore(divMain('span'))
print(divMain)
Save the changes and run the program. You should be able to see the following output on the terminal:
<div id="divMain"> <div id="content"> <ul> <li>Jade</li> <li>Jim</li> </ul> </div> <div id="list"> <p>Welcome</p> <span>Hello World</span> </div> </div>
Wrapping It Up
In this tutorial, you saw how to get started with PyQuery, a Python library that lets you run jQuery-style queries on XML and HTML documents. You saw how to manipulate the attributes and CSS styles of HTML elements.
You learnt how to create and append elements to existing elements and insert new elements before and after elements. What you saw in this tutorial is just the tip of the iceberg, and there is a lot more that this library has to offer.
For more detailed information on using this library, I would recommend reading the official documentation. Do let us know your suggestions in the comments below.