Remove only half of specific adjacent duplicates in python list

I have a tool that outputs some data. It is known that whenever '10' appears in the data, the tool adds an extra '10', i.e. the output becomes ... '10', '10', .... Sometimes there are four consecutive '10' values, which means the real data contains two '10' values.

While reading the data I am trying to remove these duplicates. So far I have learnt how to remove duplicates when exactly two adjacent duplicates are found, but when a longer, even-length run of duplicates is found, I want to keep half of them.

x = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10', ....]

Expected output:

['10', '00', 'DF', '20', '10', '10', ..]

What code have you written so far?
– Henry
1 hour ago

@Henry Using a list comprehension: y = [x[i] for i in range(len(x)) if i == 0 or x[i-1] != x[i] or x[i] != '10']
– Poka
1 hour ago

3 Answers

You may try to use groupby() from itertools:

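from itertools import groupby

X = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10']

result = []
for k, g in groupby(X):
    group = list(g)  # one run of identical adjacent values
    if k == '10':
        # keep half of each run of '10' (rounded up for odd-length runs)
        result.extend(group[:(len(group) + 1) // 2])
    else:
        # runs of any other value are kept unchanged
        result.extend(group)

print(result)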

gives:

['10', '00', 'DF', '20', '10', '10']
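
The same run-halving step can be wrapped in a function so that the value to reduce is a parameter. This is only a sketch, not part of the original answer: the name halve_runs and its target parameter are hypothetical, and it assumes that an odd-length run should round up, as the slice above does.

```python
from itertools import groupby


def halve_runs(data, target='10'):
    """Return a new list with each run of `target` cut to half its length.

    Odd-length runs round up, so a lone `target` is kept.
    Runs of any other value are left untouched.
    """
    result = []
    for key, group in groupby(data):
        run = list(group)  # materialise the run so its length is known
        if key == target:
            result.extend(run[:(len(run) + 1) // 2])
        else:
            result.extend(run)
    return result


# Hypothetical sample that also contains a doubled '00', which must survive.
sample = ['10', '10', '00', '00', 'DF', '20', '10', '10', '10', '10']
print(halve_runs(sample))
# ['10', '00', '00', 'DF', '20', '10', '10']
```

Passing the target explicitly keeps other repeated values, such as '00', '00', intact.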

@lenik I am new to this groupby function. I will check it and come back.
– Poka
1 hour ago

@Poka try for k, g in groupby(X): print(k, list(g)) -- this will give you a better understanding of what this function does.
– lenik
1 hour ago

@lenik I also have some other consecutive data like '00', '00'… and the previous solution is reducing those too. But I only want to reduce '10'.
– Poka
55 mins ago

@Poka not a problem, check the updated answer in about 15 sec =)
– lenik
52 mins ago

@Poka glad to hear that! if you find any other criteria, just leave a comment here and I'll amend the solution
– lenik
44 mins ago

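A pure Python approach:

ls = []
dupe = True  # True when the next adjacent duplicate should be dropped
for item in x:
    if ls and ls[-1] == item and dupe:
        # drop this duplicate, but keep the next one in the same run
        dupe = False
        continue
    dupe = True
    ls.append(item)

print(ls)
# ['10', '00', 'DF', '20', '10', '10']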
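
Note that the loop above halves runs of every value, not just '10'. Since the comments under the first answer ask for only '10' to be reduced, one possible variant adds a check on the item itself. This is a sketch under that assumption; the function name drop_every_other and its target parameter are hypothetical and not from the answer.

```python
def drop_every_other(items, target='10'):
    """Drop every second element within each run of `target`."""
    kept = []
    skip_next = False  # True right after a `target` has been kept
    for item in items:
        if item == target and skip_next:
            # this is the extra copy: drop it and re-arm for the next pair
            skip_next = False
            continue
        kept.append(item)
        # only a kept `target` arms the skip; other values never do
        skip_next = (item == target)
    return kept


# Hypothetical sample with a doubled '00' that should be preserved.
sample = ['10', '10', '00', '00', 'DF', '20', '10', '10', '10', '10']
print(drop_every_other(sample))
# ['10', '00', '00', 'DF', '20', '10', '10']
```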

@RafaelC I will check this solution too.
– Poka
47 mins ago

Try using the built-in next() and iter() functions. If you encounter a 10, simply skip over the next element.

raw_data_stream = iter(['10', '10', '00', 'DF', '20', '10', '10', '10', '10'])
cleaned_data_stream = []
for x in raw_data_stream:
    cleaned_data_stream.append(x)
    if x == '10':
        # skip the duplicate; the None default avoids StopIteration
        # if a '10' happens to be the last element
        next(raw_data_stream, None)

print(cleaned_data_stream)
# ['10', '00', 'DF', '20', '10', '10']
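
The same skip-the-next-element idea can be written as a generator, which may be convenient if the tool's output is read incrementally. This is a sketch only: the name dedupe_stream and its target parameter are hypothetical, and it relies on the assumption stated in the question that every real '10' is always followed by exactly one extra '10'.

```python
def dedupe_stream(stream, target='10'):
    """Yield items from `stream`, discarding the element after each `target`.

    Assumes every genuine `target` is followed by exactly one duplicate.
    """
    it = iter(stream)
    for item in it:
        yield item
        if item == target:
            # consume the duplicate; the default avoids StopIteration
            # when `target` is the final element
            next(it, None)


raw = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10']
print(list(dedupe_stream(raw)))
# ['10', '00', 'DF', '20', '10', '10']
```

Because the element after each '10' is discarded without being inspected, a lone '10' would cause the value after it to be lost: list(dedupe_stream(['10', '00'])) gives ['10']. If the doubling is not guaranteed, a run-based approach is safer.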

Not a good idea to hardcode '10', would fail for every other possible case
– RafaelC
1 hour ago

OP stated in a comment that they only want to dedup '10'. Nowhere in the question does it say that they care about any values other than '10' either.
– ben_frankly
25 mins ago

Well, that's true.
– RafaelC
20 mins ago
