submitted 1 day ago* (last edited 1 day ago) by cactus_head@programming.dev to c/linux@lemmy.ml

I have a program that requires all keywords to be in a single paragraph, usually separated by commas.

For example:

I have these terms:

1-Term
1.1-Term
2-Term
3-Term
4-Term

which I collected and organized into groups and subgroups with titles and subtitles:

Title
  • 1-Term
  • 1.1-Term
  • 2-Term
    • Sub-Title
      • 3-Term
      • 4-Term

But then I want to turn them into:

1-Term, 1.1-Term, 2-Term, 3-Term, 4-Term

That means removing certain marked words (the titles and subtitles), any empty or blank space, and line breaks, while adding commas between the terms. I want to keep certain dashes "-" (like the ones inside words):

1-Term,1.1-Term,2-Term,3-Term,4-Term

[-] bus_factor@lemmy.world 1 points 12 hours ago

If you're feeling a little old-school (and some might say masochistic), you could do a similar crude parser with a Perl one-liner. This would be more efficient compute-wise, but it's a bit of an acquired taste readability-wise:

$ perl -ne 'chomp; push @a, $1 if /^\s*-\s*(.*[^:\s])\s*$/; END{print join(",", @a), "\n"}' /tmp/foo.txt
Harry potter,Perfect Blue,Jurassic world,Jurassic Park,Jedi,Star wars,The clone wars,MCU,Gumball,Flapjack,Steven Universe,Stars vs. the forces of Evil,Wordgril,Flapjack

Here perl -n makes perl process the input one line at a time, and chomp strips the trailing newline. We match each line against /^\s*-\s*(.*[^:\s])\s*$/ (optional indentation, a dash, then text whose last non-whitespace character is not a colon, so dash-prefixed lines ending in a colon are skipped) and push the contents of the capture group onto an implicitly declared array @a. The END{} block runs after all lines have been read, and there we print the array joined with commas.

this post was submitted on 23 Feb 2026