95-869 Big Data and Large Scale Computing
Spring 2022


Course Policies


  • All devices, such as laptops, cell phones, and noisy PDAs, should be turned off for the duration of lectures and recitations, because they may distract other students.
  • Please arrive at all lectures on time and leave on time, so that your classmates are not distracted.


Students are expected to have the following background:
  • Basic knowledge of data analysis and machine learning concepts; having taken:
  • Working knowledge of linear algebra (e.g., matrix-matrix multiplication, eigenvectors, matrix rank)
  • Programming skills at a level sufficient to write a reasonably non-trivial computer program in Python.


  • Assignments are due at 11:59 PM on the due date.
  • The due dates of assignments are posted on the assignments page.
  • Assignments will be posted on Canvas.
  • Students should submit assignments electronically, via Canvas only.

  • Important Note: Because we reuse problem set questions that are covered by papers and webpages, we expect students not to copy, refer to, or look at existing solutions when preparing their answers. Since this is a graduate-level class, we expect students to want to learn rather than search online for answers. The purpose of the problem sets in this class is to help you think about the material, not just to give us the right answers. Therefore, please restrict your attention to the books mentioned on the front page when solving problems on the problem set. If you do use other material, you must acknowledge it clearly with a citation in the submitted solution.

    Academic integrity

    All students are expected to comply with CMU's policy on academic integrity. Please read the policy and make sure you have a complete understanding of it.


    You are encouraged to discuss homework problems with your fellow students. However, the work you submit must be your own. You must acknowledge in your submission any help received on your assignments. That is, you must include a comment in your homework submission that clearly states the name of the student, book, or online reference from which you received assistance.

    Submissions that fail to properly acknowledge help from other students or non-class sources will receive NO credit. Copied work will receive NO credit. All violations will be reported to the Heinz College administration and may appear on the student's transcript.

    Questions and requests

    • You should use Piazza for all your questions about the assignments and the course material. The instructor and TA(s) will do their best to answer your questions promptly.
    • Regrade requests should be made in writing (by email),
      • within 2 days after graded assignments are distributed
      • to the TA who graded the question, specifying
        • the question under dispute (e.g., 'HW1-Q.2.b')
        • the extra points requested (e.g., '2 points out of 5')
        • the justification (e.g., 'I forgot to divide by variance, but the rest of my answer was correct')
      • In the unlikely event that there is no satisfactory resolution, please contact the instructor.

    Late policy 

    • No late penalty applies for medical, family, or similar emergencies (bring written documentation, such as a doctor's note).
    • Each student is granted 3 'slip days' in total for the whole course, to accommodate coinciding deadlines, interviews, and the like. That is, no questions are asked as long as the total delay across all assignments is 3 days or less.
      • You can use the extension on any assignment during the course. For instance, you can hand in one assignment 3 days late, or 3 different assignments 1 day late each.
      • Late days are rounded up to the next whole day. For example, a submission that is 4 hours late counts as 1 day late.
      • After you have used up your slip days, any assignment handed in late will be penalized 25% per day of delay.
    • To use slip days:
      • Upload your homework on Canvas to mark the time of your submission.
      • No email to the TA is necessary; we will use the latest upload time as the submission time.
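
    The slip-day arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the official grading procedure: the function names late_days and apply_late_policy are hypothetical, and the sketch assumes that remaining slip days are applied automatically before any penalty, that the 25% penalty is a fraction of the assignment's score, and that the penalty is capped at 100%. The policy text does not spell out these details.

```python
import math
from datetime import datetime
from typing import Tuple

SLIP_DAYS_TOTAL = 3      # from the policy: 3 slip days per student
PENALTY_PER_DAY = 0.25   # from the policy: 25% per day once slip days run out
SECONDS_PER_DAY = 24 * 60 * 60


def late_days(deadline: datetime, submitted: datetime) -> int:
    """Number of late days, with any partial day rounded up to a whole day."""
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return 0  # on time or early
    return math.ceil(seconds_late / SECONDS_PER_DAY)


def apply_late_policy(days_late: int, slip_days_left: int) -> Tuple[int, float]:
    """Return (slip days remaining, penalty fraction) for one submission.

    Assumption: slip days are consumed first; only the days left over
    are penalized, and the penalty never exceeds 100%.
    """
    slip_used = min(days_late, slip_days_left)
    penalized_days = days_late - slip_used
    penalty = min(1.0, PENALTY_PER_DAY * penalized_days)
    return slip_days_left - slip_used, penalty


# A submission 4 hours after an 11:59 PM deadline counts as 1 day late.
deadline = datetime(2022, 2, 1, 23, 59)
submitted = datetime(2022, 2, 2, 3, 59)
days = late_days(deadline, submitted)
print(days)                                      # 1
print(apply_late_policy(days, SLIP_DAYS_TOTAL))  # (2, 0.0)
```

    Under these assumptions, a student with 1 slip day left who submits 2 days late would use the remaining slip day and be penalized for the other day, so apply_late_policy(2, 1) returns (0, 0.25).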