Remove only half of specific adjacent duplicates in a Python list


I have a tool that outputs some data. It is known that whenever '10' occurs in the data, an extra '10' is added after it, i.e. the new data becomes ... '10', '10', .... Sometimes there can be four consecutive '10' values, which means there are actually two '10' values.
While reading the data I am trying to remove these duplicates. So far I have learnt how to remove duplicates when only two adjacent duplicates are found, but when a longer even-length run of duplicates is found, I want to keep half of them.
x = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10', ....]
Expected output:
['10', '00', 'DF', '20', '10', '10', ..]
@Henry Using a list comprehension: y = [x[i] for i in range(len(x)) if (i == 0) or (x[i-1] != x[i]) or (x[i] != '10')]
– Poka
1 hour ago
3 Answers
You may try to use groupby() from itertools:
from itertools import groupby

X = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10']

result = []
for k, g in groupby(X):
    group = list(g)
    if k == '10':
        # keep half of each run of '10' (rounded up for odd lengths)
        result.extend(group[:(len(group) + 1) // 2])
    else:
        # every other value is passed through unchanged
        result.extend(group)
print(result)
gives:
['10', '00', 'DF', '20', '10', '10']
@lenik I am new to this groupby function. I will check it and come back.
– Poka
1 hour ago
@Poka try
for k, g in groupby(X): print(k, list(g))
-- this will give you a better understanding of what this function does.
– lenik
1 hour ago
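To make the suggestion above concrete, here is a minimal sketch of what groupby() yields for the sample list. The name runs is only illustrative; each group is a run of equal adjacent items, recorded here as its key and length.

```python
from itertools import groupby

X = ['10', '10', '00', 'DF', '20', '10', '10', '10', '10']

# groupby() splits the list into runs of equal adjacent items.
# Store each run as (key, run length) to see the structure.
runs = [(key, len(list(group))) for key, group in groupby(X)]
print(runs)
# [('10', 2), ('00', 1), ('DF', 1), ('20', 1), ('10', 4)]
```

The two runs of '10' have lengths 2 and 4, which is why halving each run gives one and two '10' values respectively.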
@lenik I also have some other consecutive data like '00', '00'… The previous solution was reducing them as well, but I only want to reduce '10'.
– Poka
55 mins ago
@Poka not a problem, check the updated answer in about 15 sec =)
– lenik
52 mins ago
@Poka glad to hear that! if you find any other criteria, just leave a comment here and I'll amend the solution
– lenik
44 mins ago
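The requirement raised in the comments (halve runs of '10' only, leave '00', '00' alone) can be sketched as a reusable function. This is only an illustration: the name halve_runs and the target parameter are assumptions, not part of the original answer.

```python
from itertools import groupby


def halve_runs(data, target='10'):
    """Keep half (rounded up) of every run of `target`; leave other items alone."""
    result = []
    for key, group in groupby(data):
        run = list(group)
        if key == target:
            # a run of 2 becomes 1, a run of 4 becomes 2
            run = run[:(len(run) + 1) // 2]
        result.extend(run)
    return result


print(halve_runs(['10', '10', '00', '00', 'DF', '10', '10', '10', '10']))
# ['10', '00', '00', 'DF', '10', '10']
```

Passing the value as a parameter avoids scattering the literal '10' through the code if the tool ever doubles a different marker.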
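A pure Python approach:
# x is the list from the question
ls = []
dupe = True  # True when the next adjacent duplicate should be dropped
for item in x:
    if ls and ls[-1] == item and dupe:
        # drop this duplicate, but allow the following one to be kept
        dupe = False
        continue
    dupe = True
    ls.append(item)
print(ls)
['10', '00', 'DF', '20', '10', '10']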
@RafaelC I will check this solution too.
– Poka
47 mins ago
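Note that the loop above drops every second adjacent duplicate of any value, so a pair like '00', '00' would also collapse to a single '00'. A minimal sketch restricted to one value follows; the function name drop_every_second and the target parameter are hypothetical, introduced only for illustration.

```python
def drop_every_second(data, target='10'):
    """Drop every second item in each run of `target`; keep everything else."""
    result = []
    skip_next = False  # True right after a kept `target`
    for item in data:
        if item == target and skip_next:
            # this is the doubled copy: drop it and re-arm for the next pair
            skip_next = False
            continue
        result.append(item)
        skip_next = (item == target)
    return result


print(drop_every_second(['10', '10', '00', '00', 'DF', '10', '10', '10', '10']))
# ['10', '00', '00', 'DF', '10', '10']
```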
Try using the built-in next() and iter() functions. If you encounter a '10', simply skip over the next element.
raw_data_stream = iter(['10', '10', '00', 'DF', '20', '10', '10', '10', '10'])
cleaned_data_stream = []
for x in raw_data_stream:
    cleaned_data_stream.append(x)
    if x == '10':
        # skip the extra '10' added by the tool; the default avoids
        # StopIteration if the data ends right after a '10'
        next(raw_data_stream, None)
print(cleaned_data_stream)
> ['10', '00', 'DF', '20', '10', '10']
Not a good idea to hardcode '10'; it would fail for every other possible case.
– RafaelC
1 hour ago
OP stated in a comment that they only want to dedup '10'. Nowhere in the question does it specify that they care about any values other than '10' either.
– ben_frankly
25 mins ago
Well, that's true.
– RafaelC
20 mins ago
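The skip-the-next-element idea can also be written as a generator with the value passed in rather than hardcoded. This is a sketch under the assumption, stated in the question, that the tool always emits an extra copy right after each target value; the name skip_after and its target parameter are hypothetical.

```python
def skip_after(stream, target='10'):
    """Yield items from `stream`, discarding the element that follows each `target`."""
    it = iter(stream)
    for item in it:
        yield item
        if item == target:
            # discard the doubled copy; the default avoids StopIteration
            # when `target` is the last element
            next(it, None)


print(list(skip_after(['10', '10', '00', 'DF', '20', '10', '10', '10', '10'])))
# ['10', '00', 'DF', '20', '10', '10']
```

Unlike the groupby() version, this discards whatever follows the target without checking its value, so it is only safe when the doubling guarantee actually holds.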
What code have you written so far?
– Henry
1 hour ago