I need help creating a Python script that can parse Jira table markup and convert it into separate lists for each row. I’m pretty new to coding so I’m not sure about the best approach.
I've had success using pandas with Jira tables! It's super helpful for transforming the data once you've loaded it. Check the pandas docs for examples and you'll get the hang of it. Good luck with your code!
You can solve this with regex pretty easily. I had the same problem with Jira exports and splitting by pipe characters works great. Here’s what I use:
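import re

def parse_jira_table(table_text):
    """Return each data row of a Jira wiki table as a list of cell strings."""
    rows = []
    for line in table_text.strip().split('\n'):
        line = line.strip()
        # Data rows start with a single pipe; header rows start with '||'
        if line.startswith('|') and not line.startswith('||'):
            # Split on pipes, absorbing the whitespace around them; the slice
            # drops the empty strings from the leading and trailing pipes
            cells = re.split(r'\s*\|\s*', line)[1:-1]
            rows.append(cells)
    return rows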
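This skips the header row (which starts with ||) and only processes data rows. The [1:-1] slice drops the empty strings that splitting leaves before the leading pipe and after the trailing one. I've used this in production and it handles typical tables fine, but it will mis-split any cell that contains a pipe itself, such as a [text|url] link.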
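If you also want the column names, here is a rough sketch of a variant that keeps the header row instead of skipping it. The function name parse_jira_table_with_header and the sample table are made up for illustration, and it assumes every row ends with a closing pipe and no cell contains a pipe of its own.

```python
def parse_jira_table_with_header(table_text):
    """Return (header, rows) from Jira table markup.

    Hypothetical helper: assumes each row ends with a closing pipe
    and that cells do not contain pipe characters.
    """
    header = []
    rows = []
    for raw_line in table_text.strip().splitlines():
        line = raw_line.strip()
        if line.startswith('||'):
            # Header cells are separated by double pipes
            header = [cell.strip() for cell in line.split('||')[1:-1]]
        elif line.startswith('|'):
            # Data cells are separated by single pipes
            rows.append([cell.strip() for cell in line.split('|')[1:-1]])
    return header, rows


# Made-up sample table for demonstration
sample = """
||Key||Summary||Status||
|ABC-1|Fix login bug|Done|
|ABC-2|Update docs|In Progress|
"""

header, rows = parse_jira_table_with_header(sample)
print(header)
for row in rows:
    print(row)
```

Each entry in rows is a separate list for one table row, which is what the original question asked for. Lines that are not part of a table are simply ignored.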
Regex works, but it's overkill for simple parsing. Basic string operations handle Jira table markup just fine without the overhead. Iterate through each line, check whether it starts with a pipe to identify table rows (a double pipe marks the header row), then call split('|') to break it into columns. Strip whitespace from each cell and drop the first and last empty elements that split('|') creates from the leading and trailing pipes. It's way more readable for beginners and easier to debug when stuff breaks. I've processed thousands of Jira tables this way and rarely hit issues that need complex pattern matching.
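For what it's worth, the steps above could be sketched as a small per-line helper like this. The name split_row is hypothetical, and the tolerance for a missing trailing pipe is my own addition rather than something Jira requires. It still assumes no cell contains a pipe.

```python
def split_row(line):
    """Split one Jira data row into a list of cells.

    Returns None for header rows and for lines that are not table rows.
    Hypothetical helper: assumes cells contain no pipe characters.
    """
    line = line.strip()
    if not line.startswith('|') or line.startswith('||'):
        return None
    # Drop the leading pipe, and the trailing one if it is present
    body = line[1:]
    if body.endswith('|'):
        body = body[:-1]
    return [cell.strip() for cell in body.split('|')]


def table_rows(table_text):
    """Collect the cell lists for every data row in the text."""
    rows = []
    for line in table_text.splitlines():
        cells = split_row(line)
        if cells is not None:
            rows.append(cells)
    return rows


print(table_rows("||Name||Role||\n| Ann | Dev |\n| Bob | QA"))
```

Handling one line at a time makes it easy to print the offending line when a row comes out with the wrong number of cells.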