Recursive python code to analyze ways to cut a rope into ever smaller pieces

cut_rope
generated by myself, in Adobe Illustrator

In a recent coding challenge one question was to get some statistics on the length of rope left after a certain number of cuts. Starting with rope length N, the rope is cut into 3 random but integer length pieces; if this is impossible no more cuts are made. The largest piece resulting is then subject to 2 more cuts. Repeat this T times. The last remaining longest piece is recorded. The statistical questions were on the conditional probability of a certain length if at least at a certain other length.

This post is about the code I wrote for this challenge, and what I would improve if I were to rewrite it.

I implemented a recursive algorithm to generate lots of data and put it into a list. Then I imported the list as a pandas data frame for the statistical analysis. Here is the core function (with debugging print statements removed):

def cut(N, T):
	bit = N
	T_counter = T
	if T_counter >= 0:
		x = random.randrange(1, N-1)
		y = random.randrange(x, N)
		peices = [ x, y - x, N -y]
		bit = max(peices)
		T_counter -= 1
		if bit > 3:
			return cut(bit, T_counter)
		else:
			S_list.append(bit)
	else:
		S_list.append(bit)

There are a couple parts that I would change if I were to rewrite without time constraints. If you are a very careful reader, you would notice that the function adds to a list thats not defined within the function… this is sloppy. I should make the function return a list, then call the function as so:

list_to_analyze = cut(N, T)

I had the code save to file a .csv made from the list of end rope lengths, this was only necessary to backup my data if my ancient laptop were to crash. I could have just turned the list into a pandas data frame and saved a few lines.

I didn’t have time to learn how to use bayespy, but that would have made the statistical portion of the code much more readable.

Finally, making the N and T’s user defined, rather than hardcoded would be a nice touch, say with sys argv or even user defined input.

Well, thats it for now, and if you want to see the whole script, it follows.

~Rafael

import matplotlib.pyplot as plt
import pandas as pd
import math
import csv
import math
import random

S_list = []
def cut(N, T):
	bit = N
	T_counter = T
	if T_counter >= 0:
		x = random.randrange(1, N-1)
		#print 'x = {} '.format(x)
		y = random.randrange(x, N)
		#print 'y = {} '.format(y)
		peices = [ x, y - x, N -y]
		bit = max(peices)
		#print 'bit = {} '.format(bit)
		T_counter -= 1
		if bit > 3:
			return cut(bit, T_counter)
		else:
			S_list.append(bit)
			#print 'N too short end. S = {} added to S_list \n'.format(bit)
	else:
		S_list.append(bit)
		#print 'T end. S = {} added to S_list \n'.format(bit)

iter = 100000000
while iter >=0:
	cut(1024, 10)
	iter -= 1

#print 'S_list = {}'.format(S_list)

csvfile = "N1024_T10.csv"
with open(csvfile, "w") as output:
	writer = csv.writer(output, lineterminator='\n')
	for val in S_list:
	        writer.writerow([val])

S_list = []
iter = 100000000
while iter >=0:
	cut(64, 5)
	iter -= 1

#print 'S_list = {}'.format(S_list)

csvfile = "N64_T5.csv"
with open(csvfile, "w") as output:
	writer = csv.writer(output, lineterminator='\n')
	for val in S_list:
	        writer.writerow([val])

# name data colums and read data
name_64 = ['S_64']
name_1024 = ['S_1024']
df_64   = pd.read_csv('N64_T5.csv', names = name_64)
df_1024 = pd.read_csv('N1024_T10.csv', names = name_1024) 

# simple extraction of stats
print df_64.describe()
print '\n'
print df_1024.describe()
print '\n'

# count instances for prob
dens_64_tmp = df_64['S_64'].value_counts(normalize=True)
dens_1024_tmp = df_1024['S_1024'].value_counts(normalize=True)

# filter data for conditional probs
dens_64 = dens_64_tmp.to_frame().reset_index()
dens_1024 = dens_1024_tmp.to_frame().reset_index()
#
dens_64_fil_8 = dens_64['index'] > 8
dens_64_fil_4 = dens_64['index'] > 4

dens_1024_fil_12 = dens_1024['index'] > 12
dens_1024_fil_6  = dens_1024['index'] > 6

dens_64_over_8 = dens_64[ dens_64_fil_8 ]
dens_64_over_4 = dens_64[ dens_64_fil_4 ]

dens_1024_over_12 = dens_1024[ dens_1024_fil_12 ]
dens_1024_over_6 =  dens_1024[ dens_1024_fil_6 ]

# calc probability of everything
dens_64_sum_over_8 = dens_64_over_8['S_64'].sum()
dens_64_sum_over_4 = dens_64_over_4['S_64'].sum()
prob_over_8_if_over_4 = dens_64_sum_over_8 / dens_64_sum_over_4 

dens_1024_sum_over_12 = dens_1024_over_12['S_1024'].sum()
dens_1024_sum_over_6  = dens_1024_over_6[ 'S_1024'].sum()
prob_over_12_if_over_6 = dens_1024_sum_over_12 / dens_1024_sum_over_6 

# print conditional stats
print 'dens_64_sum_over_8 = \n{} \n'.format(dens_64_sum_over_8)
print 'dens_64_sum_over_4 = \n{} \n'.format(dens_64_sum_over_4)
print 'prob_over_8_if_over_4 = {} \n'.format(prob_over_8_if_over_4)

print 'dens_1024_sum_over_12 \n{}\n'.format(dens_1024_sum_over_12)
print 'dens_1024_sum_over_6 \n{} \n'.format(dens_1024_sum_over_6)
print 'prob_over_12_if_over_6 = {} \n'.format(prob_over_12_if_over_6)

August 28, 2018 (0)


Leave a Reply

Your email address will not be published. Required fields are marked *