Remove only half of specific adjacent duplicates in a Python list




I have a tool that outputs some data. It is known that whenever '10' occurs in the data, an extra '10' is added, i.e. the new data becomes ... '10', '10', .... Sometimes there are four consecutive '10' values, which means the real data contains two '10' values.



While reading the data I am trying to remove the duplicates. So far I have learnt how to remove duplicates when exactly two adjacent duplicates are found, but when a longer even-length run of duplicates is found, I want to keep half of them.


x = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10', ....]



Expected output


['10', '00', 'DF', '20', '10', '10', ..]





What code have you written so far?
– Henry
1 hour ago





@Henry Using a list comprehension: y = [x[i] for i in range(len(x)) if (i == 0) or (x[i-1] != x[i]) or (x[i] != '10')]
– Poka
1 hour ago
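
To make the limitation of that comprehension concrete, here is a minimal sketch that runs it on the sample list from the question. It drops every '10' that directly follows another '10', so a run of four collapses to one instead of two.

```python
x = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10']

# Keep an element unless it is a '10' that directly follows another '10'.
y = [x[i] for i in range(len(x))
     if i == 0 or x[i - 1] != x[i] or x[i] != '10']

print(y)  # ['10', '00', 'DF', '20', '10']
```

The trailing run of four '10' values yields a single '10', whereas the expected output keeps two.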




3 Answers



You may try to use groupby() from itertools:




from itertools import groupby

X = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10']

result = []
for k, g in groupby(X):
    group = list(g)
    if k == '10':
        # keep half of each run of '10' (rounded up for odd lengths)
        result.extend(group[:(len(group) + 1) // 2])
    else:
        # leave every other value untouched
        result.extend(group)
print(result)



gives:


['10', '00', 'DF', '20', '10', '10']
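
The same idea can be wrapped in a small function so that the value to halve is a parameter rather than a literal. This is only a sketch: the name halve_runs and its target argument are made up for illustration and are not part of the original answer.

```python
from itertools import groupby


def halve_runs(data, target='10'):
    """Return a list in which each run of `target` is cut to half its length, rounded up."""
    result = []
    for key, group in groupby(data):
        run = list(group)
        if key == target:
            # a run of 2 becomes 1, a run of 4 becomes 2, a run of 3 becomes 2
            run = run[:(len(run) + 1) // 2]
        result.extend(run)
    return result


sample = ['10', '10', '00', '00', 'DF', '10', '10', '10', '10']
print(halve_runs(sample))  # ['10', '00', '00', 'DF', '10', '10']
```

Runs of other values, such as '00', '00', pass through unchanged.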





@lenik I am new to the groupby function. I will check it and come back.
– Poka
1 hour ago






@Poka try for k, g in groupby(X): print(k, list(g)) -- this will give you a better understanding of what this function does.
– lenik
1 hour ago
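
For reference, a short sketch of that inspection on the sample list. groupby yields one (key, group) pair per run of equal adjacent items, so the two separate runs of '10' show up as two separate groups.

```python
from itertools import groupby

X = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10']

# Materialise each group, because the group iterators are consumed as groupby advances.
groups = [(key, list(group)) for key, group in groupby(X)]
for key, members in groups:
    print(key, members)

# 10 ['10', '10']
# 00 ['00']
# DF ['DF']
# 20 ['20']
# 10 ['10', '10', '10', '10']
```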







@lenik I also have some other consecutive data like '00', '00'… and the previous solution reduces those as well. But I only want to reduce '10'.
– Poka
55 mins ago





@Poka not a problem, check the updated answer in about 15 sec =)
– lenik
52 mins ago





@Poka glad to hear that! if you find any other criteria, just leave a comment here and I'll amend the solution
– lenik
44 mins ago



A pure Python approach:


ls = []
dupe = True  # True means the next adjacent repeat will be dropped
for item in x:
    if ls and ls[-1] == item and dupe:
        dupe = False
        continue
    dupe = True
    ls.append(item)

print(ls)

['10', '00', 'DF', '20', '10', '10']
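
Note that the loop above halves runs of any repeated value, not just '10'. A possible variant that leaves other repeats such as '00', '00' alone is sketched below. The function name drop_every_second and the target parameter are hypothetical, added only for illustration.

```python
def drop_every_second(data, target='10'):
    """Drop every second item in each run of `target`; keep everything else as-is."""
    result = []
    skip_next = True  # True means the next adjacent repeat of `target` is dropped
    for item in data:
        if item == target and result and result[-1] == item and skip_next:
            skip_next = False
            continue
        skip_next = True
        result.append(item)
    return result


sample = ['10', '10', '00', '00', '10', '10', '10', '10']
print(drop_every_second(sample))  # ['10', '00', '00', '10', '10']
```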





@RafaelC I will check this solution too.
– Poka
47 mins ago



Try using the built-in next() and iter() functions. If you encounter a 10, simply skip over the next element.




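raw_data_stream = iter(['10', '10', '00', 'DF', '20', '10', '10', '10', '10'])

cleaned_data_stream = []
for x in raw_data_stream:
    cleaned_data_stream.append(x)
    if x == '10':
        # skip the element right after a '10'; the None default avoids
        # StopIteration if the data ends on an unpaired '10'
        next(raw_data_stream, None)

print(cleaned_data_stream)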


> ['10', '00', 'DF', '20', '10', '10']





Not a good idea to hardcode '10', would fail for every other possible case
– RafaelC
1 hour ago








OP stated in a comment that they only want to dedup 10. Nowhere in the question does it specify that they care about any values other than 10 either.
– ben_frankly
25 mins ago








Well, that's true.
– RafaelC
20 mins ago





