Activity #2

Use ellmer to interact with an LLM (from Anthropic, in this case) to generate feedback.

Tooling

prompt.R:

# Relies on ellmer (chat_anthropic), glue (glue, glue_collapse), and readr (read_lines).
# rubric_detailed is not an argument: glue() finds it in the enclosing environment.
prompt <- function(question, rubric, answer) 
{
    chat <- chat_anthropic(system_prompt = paste("You are a helpful course instructor teaching a course on data science with the R programming language and the tidyverse and tidymodels suite of packages. You like to give succinct but precise feedback.", 
        glue_collapse(read_lines("system-prompt/pipes.qmd"))))
    chat$chat(glue("Carefully read the {question} and the {rubric_detailed}, then evaluate {answer} against the {rubric_detailed} to provide feedback. \n      Provide feedback in an output section named **Feedback:**. \n      Format the feedback as bullet points: Each bullet point should first state the rubric item text from {rubric}, and then provide one sentence explaining whether the {answer} meets the {rubric} item.\n      Do not give away the correct answer in the feedback."))
}

Note: The system prompt points to a file that has the contents of the pipes chapter of the Tidyverse style guide.
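
The template in prompt() refers to {rubric_detailed}, which is not one of the function's arguments, so the value has to exist in the environment where the function was defined. A minimal sketch of a variant that takes it as an explicit argument is shown here. The names build_feedback_prompt() and prompt_explicit() and the system_file and model parameters are hypothetical, the prompt wording is shortened and relabeled, and pinning model to the ID that appears in the "Using model" messages is an assumption about what you might want, not part of the original function. Base R string functions are used so that the prompt text can be inspected without any packages.

```r
# Build the user prompt as a plain string so it can be inspected without an API call.
build_feedback_prompt <- function(question, rubric, rubric_detailed, answer) {
  paste0(
    "Carefully read the question and the detailed rubric, then evaluate the ",
    "answer against the detailed rubric to provide feedback.\n",
    "Provide feedback in an output section named **Feedback:**.\n",
    "Format the feedback as bullet points: each bullet should first state the ",
    "rubric item text, then give one sentence on whether the answer meets it.\n",
    "Do not give away the correct answer in the feedback.\n\n",
    "Question:\n", question, "\n\n",
    "Rubric:\n", rubric, "\n\n",
    "Detailed rubric:\n", rubric_detailed, "\n\n",
    "Answer:\n", answer
  )
}

# Hypothetical wrapper; calling it needs the ellmer package and an Anthropic API key.
prompt_explicit <- function(question, rubric, rubric_detailed, answer,
                            system_file = "system-prompt/pipes.qmd",
                            model = "claude-sonnet-4-20250514") {
  chat <- ellmer::chat_anthropic(
    system_prompt = paste(
      "You are a helpful course instructor teaching a course on data science with R.",
      paste(readLines(system_file), collapse = "\n")
    ),
    model = model
  )
  chat$chat(build_feedback_prompt(question, rubric, rubric_detailed, answer))
}

# Inspect the prompt text only (no API call is made here; inputs are made up).
example_prompt <- build_feedback_prompt(
  question = "Reshape country_inflation.",
  rubric = "Code uses pivot_longer().",
  rubric_detailed = "Code uses pivot_longer() with names_to and values_to.",
  answer = "df <- pivot_longer(country_inflation, -country)"
)
cat(example_prompt)
```

Separating the string construction from the API call makes it possible to review exactly what is sent to the model before spending any tokens.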

Example: Tidying

Below is a question on tidying.

For this question you will work with inflation data from various countries in the world over the last 30 years. The dataset is called country-inflation.csv and it’s in your data folder.

Reshape (pivot) country_inflation such that each row represents a country/year combination, with columns country, year, and annual_inflation. Then, display the resulting data frame and state how many rows and columns it has.


Code for constructing function inputs
question <- read_lines("questions/tidy-country-inflation-pivot-longer/tidy-country-inflation-pivot-longer-Q.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")

rubric <- read_lines("questions/tidy-country-inflation-pivot-longer/tidy-country-inflation-pivot-longer-R.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")

rubric_detailed <- read_lines("questions/tidy-country-inflation-pivot-longer/tidy-country-inflation-pivot-longer-RD.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")

bad_answer <- read_lines("questions/tidy-country-inflation-pivot-longer/tidy-country-inflation-pivot-longer-sample-answer-bad.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")

good_answer <- read_lines("questions/tidy-country-inflation-pivot-longer/tidy-country-inflation-pivot-longer-sample-answer-good.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")
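
The five pipelines above repeat the same three steps: read the file, join its lines, and strip the YAML front matter. As a sketch, those steps could be wrapped in one helper. The name read_qmd_body() is hypothetical, and base R functions stand in for read_lines(), glue_collapse(), and str_remove() so the example runs without extra packages. Like the original, it joins lines with no separator.

```r
# Read a .qmd file, join its lines, and drop the leading YAML front matter.
read_qmd_body <- function(path) {
  text <- paste(readLines(path, warn = FALSE), collapse = "")
  # The lazy match removes only the first "--- ... ---" span.
  sub("---.*?---", "", text, perl = TRUE)
}

# Self-contained demo on a temporary file (contents are made up).
tmp <- tempfile(fileext = ".qmd")
writeLines(c("---", "title: Demo", "---", "Reshape the data."), tmp)
demo_body <- read_qmd_body(tmp)
print(demo_body)
# [1] "Reshape the data."
```

With such a helper, each input would become a single call on the corresponding path, for example the -Q.qmd file for question.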

Below is what we might consider a bad answer to this question.

country_inflation <- read_csv("data/country-inflation.csv")

There are 3 columns and 1178 rows in this new data frame.

df <- country_inflation |> 
    pivot_longer(
    cols = -country,names_to = "year", 
    values_to = "inflationrate"
               )
print(df)
# A tibble: 1,178 × 3
   country   year  inflationrate
   <chr>     <chr>         <dbl>
 1 Australia 1993          1.75 
 2 Australia 1994          1.97 
 3 Australia 1995          4.63 
 4 Australia 1996          2.62 
 5 Australia 1997          0.225
 6 Australia 1998          0.860
 7 Australia 1999          1.48 
 8 Australia 2000          4.46 
 9 Australia 2001          4.41 
10 Australia 2002          2.98 
# ℹ 1,168 more rows

Provide feedback on the answer above using the following rubric.

  • Code uses pivot_longer().

  • Code names the data frame something short and informative.

  • Code names the year variable year and the inflation variable annual_inflation.

  • Code transforms the year variable to numeric inside pivot_longer().

  • Output displays country_inflation_longer.

  • Output has 3 columns: country, year, and annual_inflation.

  • Narrative states the correct numbers of rows and columns.

  • Code style and readability: Line breaks after each |>, proper indentation, spaces around = signs if they are present, and spaces after commas if they are present.

prompt(
  question = question, 
  rubric = rubric,
  answer = bad_answer
)
Using model = "claude-sonnet-4-20250514".
**Feedback:**

- **Code uses `pivot_longer()`**: ✓ The code correctly uses `pivot_longer()` to
reshape the data from wide to long format.

- **Code names the data frame something short and informative**: ✗ The data 
frame is named `df` which is too generic and not informative about what the 
data contains.

- **Code names the year variable `year` and the inflation variable 
`annual_inflation`**: ✗ While the year variable is correctly named `year`, the 
inflation variable is named `inflationrate` instead of the required 
`annual_inflation`.

- **Code transforms the year variable to numeric inside `pivot_longer()`**: ✗ 
The code is missing the `names_transform = as.numeric` argument that was 
specifically required to convert the year variable to numeric type.

- **Output displays `country_inflation_longer`**: ✗ The output shows `df` 
instead of the expected `country_inflation_longer` data frame name.

- **Output has 3 columns: `country`, `year`, and `annual_inflation`**: ✗ While 
there are 3 columns, the inflation column is named `inflationrate` rather than 
`annual_inflation`.

- **Narrative states the correct numbers of rows and columns**: ✗ The narrative
states "1178 rows and 3" but should state "1,178 rows and 3 columns" for 
complete accuracy.

- **Code style and readability**: ✗ The code has inconsistent spacing (missing 
spaces around `=` in `names_to = "year"` and `values_to = "inflationrate"`), 
and the closing parenthesis alignment could be improved.

Provide feedback on the same answer a second time.

prompt(
  question = question, 
  rubric = rubric,
  answer = bad_answer
)
Using model = "claude-sonnet-4-20250514".
**Feedback:**

• **Code uses `pivot_longer()`** - ✓ The code correctly uses `pivot_longer()` 
to reshape the data from wide to long format.

• **Code names the data frame something short and informative** - ✗ The data 
frame is named `df` which is not informative about what the data contains.

• **Code names the year variable `year` and the inflation variable 
`annual_inflation`** - ✗ While the year variable is correctly named `year`, the
inflation variable is named `inflationrate` instead of the required 
`annual_inflation`.

• **Code transforms the year variable to numeric inside `pivot_longer()`** - ✗ 
The code is missing the `names_transform = as.numeric` argument that was 
specifically required to convert the year variable to numeric during pivoting.

• **Output displays `country_inflation_longer`** - ✗ The output shows a data 
frame named `df` rather than the expected `country_inflation_longer`.

• **Output has 3 columns: `country`, `year`, and `annual_inflation`** - ✗ While
there are 3 columns, the third column is named `inflationrate` rather than 
`annual_inflation`.

• **Narrative states the correct numbers of rows and columns** - ✗ The 
narrative states 1178 rows and 3 columns but places this information before the
code chunk rather than after displaying the results.

• **Code style and readability** - ✗ The code has inconsistent spacing (missing
spaces after commas in some places) and could benefit from better formatting 
alignment within the `pivot_longer()` function.
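
The two runs on the bad answer reach the same ✓/✗ verdict on every rubric item but word their explanations differently. One way to compare runs programmatically is to pull the rubric item and its mark out of each bullet. The sketch below uses a hypothetical helper, parse_marks(), and assumes the format seen in these outputs: bullets starting with "-" or "•", the rubric item in bold, and bullets separated by blank lines. The demo inputs are abridged excerpts of the two runs.

```r
# Extract one row per rubric bullet: the bolded item text and its check/cross mark.
parse_marks <- function(feedback) {
  text <- paste(feedback, collapse = "\n")
  # Bullets are assumed to be separated by blank lines.
  blocks <- trimws(strsplit(text, "\n[ \t]*\n", perl = TRUE)[[1]])
  bullet <- "^(-|\u2022)[ \t]+"
  blocks <- blocks[grepl(bullet, blocks, perl = TRUE)]
  # Unwrap hard line breaks inside each bullet.
  blocks <- gsub("[ \t]*\n[ \t]*", " ", blocks, perl = TRUE)
  item <- sub(paste0(bullet, "\\*\\*(.*?)\\*\\*.*$"), "\\2", blocks, perl = TRUE)
  item <- sub("[:.]$", "", item)
  # TRUE for a check mark, FALSE for a cross, NA when neither symbol is present.
  met <- ifelse(grepl("\u2713", blocks, fixed = TRUE), TRUE,
         ifelse(grepl("\u2717", blocks, fixed = TRUE), FALSE, NA))
  data.frame(item = item, met = met, stringsAsFactors = FALSE)
}

run1 <- c(
  "**Feedback:**",
  "",
  "- **Code uses `pivot_longer()`**: \u2713 The code correctly uses `pivot_longer()` to",
  "reshape the data from wide to long format.",
  "",
  "- **Code names the data frame something short and informative**: \u2717 The data",
  "frame is named `df` which is too generic."
)
run2 <- c(
  "**Feedback:**",
  "",
  "\u2022 **Code uses `pivot_longer()`** - \u2713 The code correctly uses `pivot_longer()`",
  "to reshape the data from wide to long format.",
  "",
  "\u2022 **Code names the data frame something short and informative** - \u2717 The data",
  "frame is named `df` which is not informative."
)

marks1 <- parse_marks(run1)
marks2 <- parse_marks(run2)
# TRUE when both runs reach the same verdict on every rubric item.
same_verdicts <- identical(marks1$met, marks2$met)
print(marks1)
print(same_verdicts)
```

Feedback that does not use the ✓ and ✗ symbols would yield NA in the met column, so this approach only works when the model happens to, or is asked to, emit them.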

Below is what we might consider a good answer to this question.

country_inflation <- read_csv("data/country-inflation.csv")

The data frame, “yearly_inflation,” has 1,178 rows and three columns.

yearly_inflation <- country_inflation |>
  pivot_longer(
    cols = `1993`:`2023`,
    names_to = "year",
    names_transform = as.numeric,
    values_to = "annual_inflation"
  )

yearly_inflation
# A tibble: 1,178 × 3
   country    year annual_inflation
   <chr>     <dbl>            <dbl>
 1 Australia  1993            1.75 
 2 Australia  1994            1.97 
 3 Australia  1995            4.63 
 4 Australia  1996            2.62 
 5 Australia  1997            0.225
 6 Australia  1998            0.860
 7 Australia  1999            1.48 
 8 Australia  2000            4.46 
 9 Australia  2001            4.41 
10 Australia  2002            2.98 
# ℹ 1,168 more rows

Provide feedback on the answer above.

prompt(
  question = question, 
  rubric = rubric,
  answer = good_answer
)
Using model = "claude-sonnet-4-20250514".
**Feedback:**

• **Code uses `pivot_longer()`:** ✓ The code correctly uses `pivot_longer()` to
reshape the data from wide to long format.

• **Code names the data frame something short and informative:** ✓ The data 
frame is named `yearly_inflation`, which is both concise and descriptive of its
contents.

• **Code names the year variable `year` and the inflation variable 
`annual_inflation`:** ✓ The code properly uses `names_to = "year"` and 
`values_to = "annual_inflation"` to create the required column names.

• **Code transforms the year variable to numeric inside `pivot_longer()`:** ✓ 
The code includes `names_transform = as.numeric` to convert the year variable 
to numeric type during the pivot operation.

• **Output displays `country_inflation_longer`:** ✗ The rubric expects the 
output to display a data frame named `country_inflation_longer`, but the code 
creates and displays `yearly_inflation` instead.

• **Output has 3 columns: `country`, `year`, and `annual_inflation`:** ✓ The 
resulting data frame contains the three required columns with the correct 
names.

• **Narrative states the correct numbers of rows and columns:** ✓ The narrative
correctly states that the data frame has 1,178 rows and 3 columns.

• **Code style and readability:** ✓ The code follows proper tidyverse style 
with line breaks after the pipe operator, appropriate indentation, and proper 
spacing around operators and after commas.
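
Two rubric items, converting year to numeric inside pivot_longer() and stating the correct dimensions, can be checked on a small table. The sketch below assumes the tidyr package is installed. The Australia values are taken from the output shown earlier, while the second country, "Examplia", and its values are made up. In the real data the selected columns `1993`:`2023` span 31 years, so 1,178 rows would correspond to 1178 / 31 = 38 countries, assuming one wide row per country.

```r
library(tidyr)

# Toy wide table: Australia values come from the example output; "Examplia" is made up.
wide <- data.frame(
  country = c("Australia", "Examplia"),
  `1993` = c(1.75, 2.00),
  `1994` = c(1.97, 2.10),
  `1995` = c(4.63, 2.20),
  check.names = FALSE
)

long <- wide |>
  pivot_longer(
    cols = -country,
    names_to = "year",
    names_transform = as.numeric,
    values_to = "annual_inflation"
  )

# Long format has (number of countries) x (number of year columns) rows.
expected_rows <- nrow(wide) * (ncol(wide) - 1)
print(dim(long))
print(class(long$year))
```

Without names_transform, year would stay a character column, which is what the <chr> type in the bad answer's output shows.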

Example: Modeling

Below is a question on modeling:

For this part of the analysis, the dataset that will be used is called births14, and it is included in the openintro package you loaded at the beginning of the assignment. Note: this question relies on an earlier part of the assignment, where the births14 dataset is filtered to remove NA values and assigned the name births14_habitgiven.

  1. Fit a linear model that predicts weight from habit and save the model object. Then, provide the tidy summary output.

  2. Write the estimated least squares regression line below using proper notation.

  3. Interpret the intercept in the context of the data and the research question. Is the intercept meaningful in this context? Why or why not?

  4. Interpret the slope in the context of the data and the research question.


Code for constructing function inputs
question <- read_lines("questions/model-birth-simple-linear/model-births-simple-linear-Q.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")

rubric <- read_lines("questions/model-birth-simple-linear/model-births-simple-linear-R.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")

rubric_detailed <- read_lines("questions/model-birth-simple-linear/model-births-simple-linear-RD.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")

bad_answer <- read_lines("questions/model-birth-simple-linear/model-births-simple-linear-sample-answer-bad.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")

good_answer <- read_lines("questions/model-birth-simple-linear/model-births-simple-linear-sample-answer-good.qmd") |>
  glue_collapse() |>
  str_remove("---.*?---")
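
Because the five file paths differ only in their suffix, they are easy to mix up when copied and edited by hand. A sketch of a helper that derives all five from one directory name and one file stem is shown here. The name question_paths() is hypothetical, and it assumes the suffix scheme seen in the tidying example (Q, R, RD, sample-answer-bad, sample-answer-good) holds for every question.

```r
# Build the five input paths from one directory name and one file stem.
question_paths <- function(dir, stem, root = "questions") {
  suffixes <- c(
    question = "Q",
    rubric = "R",
    rubric_detailed = "RD",
    bad_answer = "sample-answer-bad",
    good_answer = "sample-answer-good"
  )
  paths <- file.path(root, dir, paste0(stem, "-", suffixes, ".qmd"))
  names(paths) <- names(suffixes)
  paths
}

paths <- question_paths(
  dir = "model-birth-simple-linear",
  stem = "model-births-simple-linear"
)
print(paths[["rubric_detailed"]])
# [1] "questions/model-birth-simple-linear/model-births-simple-linear-RD.qmd"
```

The directory and stem are separate arguments because they differ slightly for this question (birth versus births).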

Below is what we might consider a bad answer to this question.

births14_habitgiven <- births14 |>
  filter(!is.na(habit))
wt_habit_fit <- linear_reg() %>% fit(weight~habit,data=births14_habitgiven)

wt_habit_fit
parsnip model object


Call:
stats::lm(formula = weight ~ habit, data = data)

Coefficients:
(Intercept)  habitsmoker  
     7.2699      -0.5927  

\[ y = 7.2 + 0.6\times smoker_factor \]

  1. When the habit factor value = 0, and because the LSRL line is negative, and the median birth weight for smoking mothers is less than non-smoking mothers. The weight of mothers born to non-smoker mothers is always 7.27 pounds.

  2. A baby born to a smoker is 0.59 pounds lighter than a baby with a non-smoker mother.

Provide feedback on the answer above using the following rubric.

  • Part (a) - Code correctly fits linear regression model with the correct response and predictor variables using tidymodels functions.

  • Part (a) - Regression summary is displayed with tidy().

  • Part (b) - Estimated least squares regression line is correct.

  • Part (c) - Narrative provides correct interpretation of the intercept in context of the data.

  • Part (c) - Narrative states that the intercept is meaningful and provides correct justification.

  • Part (d) - Narrative provides correct interpretation of the slope in context of the data.

  • Code smell, style, and readability: The code should follow the Tidyverse style guide.

prompt(
  question = question, 
  rubric = rubric,
  answer = bad_answer
)
Using model = "claude-sonnet-4-20250514".
**Feedback:**

- **Part (a) - Code correctly fits linear regression model with the correct 
response and predictor variables using tidymodels functions.** Your code 
correctly fits a linear regression model with weight as the response and habit 
as the predictor using the `linear_reg()` and `fit()` functions from 
tidymodels.

- **Part (a) - Regression summary is displayed with `tidy()`.** You need to 
display the regression summary using the `tidy()` function - currently you're 
only showing the model object itself.

- **Part (b) - Estimated least squares regression line is correct.** Your 
equation has several issues: the coefficient values don't match what would come
from the actual model output, you're missing the "hat" notation for predicted 
values, and the variable name should match the actual coding of the habit 
variable.

- **Part (c) - Narrative provides correct interpretation of the intercept in 
context of the data.** Your interpretation is confusing and contains errors - 
you mention "habit factor value = 0" and "weight of mothers" instead of 
focusing on baby weights, and you state the weight is "always" 7.27 pounds 
rather than using language about expectations or averages.

- **Part (c) - Narrative states that the intercept is meaningful and provides 
correct justification.** You haven't addressed whether the intercept is 
meaningful in this context or provided appropriate justification for your 
conclusion.

- **Part (d) - Narrative provides correct interpretation of the slope in 
context of the data.** Your interpretation is close but needs to include 
language about expectations or averages rather than making a definitive 
statement about individual babies.

- **Code smell, style, and readability: The code should follow the Tidyverse 
style guide.** Your code needs improvement in spacing - there should be spaces 
around the `=` sign and after commas in function calls (e.g., `data = 
births14_habitgiven`).

Below is what we might consider a good answer to this question.

births14_habitgiven <- births14 |>
  filter(!is.na(habit))
wt_habit_fit <- linear_reg() |>
  fit(weight ~ habit, data = births14_habitgiven)

tidy(wt_habit_fit)
# A tibble: 2 × 5
  term        estimate std.error statistic    p.value
  <chr>          <dbl>     <dbl>     <dbl>      <dbl>
1 (Intercept)    7.27     0.0435    167.   0         
2 habitsmoker   -0.593    0.128      -4.65 0.00000382

\[ \widehat{weight} = 7.27 - 0.59 \times smoker \]

  1. The expected weight of babies born to non-smoker mothers is 7.27 pounds. This intercept is meaningful because it is plausible that women in the sample are non-smokers, and 7.27 lbs is a possible weight of a newborn baby.

  2. Babies born to smokers are, on average, about 0.59 pounds lighter than babies with non-smoker mothers.

Provide feedback on the answer above.

prompt(
  question = question, 
  rubric = rubric,
  answer = good_answer
)
Using model = "claude-sonnet-4-20250514".
**Feedback:**

• **Part (a) - Code correctly fits linear regression model with the correct 
response and predictor variables using tidymodels functions.** Your code 
correctly uses `linear_reg()` and `fit()` with the appropriate variables, but 
you need to add proper spacing around the `=` sign in 
`data=births14_habitgiven`.

• **Part (a) - Regression summary is displayed with `tidy()`.** This 
requirement is not met - you displayed the raw model output instead of using 
the `tidy()` function to create a clean summary table.

• **Part (b) - Estimated least squares regression line is correct.** This does 
not meet the requirement - your equation has incorrect coefficient values and 
should use proper notation with a hat symbol over y to indicate predicted 
values.

• **Part (c) - Narrative provides correct interpretation of the intercept in 
context of the data.** This does not meet the requirement - your interpretation
is confusing and doesn't clearly explain what the intercept represents in terms
of expected baby weight for non-smoking mothers.

• **Part (c) - Narrative states that the intercept is meaningful and provides 
correct justification.** This requirement is not met - you don't clearly state 
whether the intercept is meaningful or provide proper justification for your 
conclusion.

• **Part (d) - Narrative provides correct interpretation of the slope in 
context of the data.** This partially meets the requirement but should include 
language about "on average" or "expected" rather than stating it as an absolute
fact about individual babies.

• **Code smell, style, and readability: The code should follow the Tidyverse 
style guide.** This partially meets the requirement - while you use the base 
pipe correctly, you need to add spaces around the equals sign and after the 
comma in your `fit()` function.

Provide feedback on the same answer a second time.

prompt(
  question = question, 
  rubric = rubric,
  answer = good_answer
)
Using model = "claude-sonnet-4-20250514".
**Feedback:**

• **Part (a) - Code correctly fits linear regression model with the correct 
response and predictor variables using tidymodels functions.** Your code 
correctly uses `linear_reg()` and `fit()` with the appropriate variables, but 
you need to add a space after the comma in your function call.

• **Part (a) - Regression summary is displayed with `tidy()`.** This 
requirement is not met - you need to display the tidy summary output using the 
`tidy()` function on your fitted model object.

• **Part (b) - Estimated least squares regression line is correct.** This does 
not meet the requirement - you need to use the actual coefficient values from 
your model output and include proper notation (such as $\widehat{y}$ for the 
predicted value).

• **Part (c) - Narrative provides correct interpretation of the intercept in 
context of the data.** This does not meet the requirement - your interpretation
is unclear and contains errors; you need to explain what the intercept 
represents in terms of expected baby weight for a specific group, using 
language like "on average" or "expected."

• **Part (c) - Narrative states that the intercept is meaningful and provides 
correct justification.** This requirement is not met - you need to clearly 
state whether the intercept is meaningful and provide a proper justification 
based on the plausibility of the baseline condition.

• **Part (d) - Narrative provides correct interpretation of the slope in 
context of the data.** This partially meets the requirement but needs 
improvement - you should use language indicating this is about 
expectations/averages rather than individual cases, and ensure your numerical 
values match the model output.

• **Code smell, style, and readability: The code should follow the Tidyverse 
style guide.** Your code mostly follows the style guide but needs a space after
the comma in your `fit()` function call.
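
The interpretations in the good answer rest on a general property of a linear model with a single two-level predictor such as habit: the intercept is the mean response in the reference group, and the slope is the difference between the two group means. In the tidy() output of the good answer, that makes 7.27 the estimated mean weight for the nonsmoker group and 7.27 - 0.593 ≈ 6.68 the estimate for the smoker group. The sketch below checks the property on made-up data. It calls base R's lm() directly rather than linear_reg() |> fit(), which, per the printed model object, calls stats::lm underneath.

```r
# Made-up weights (pounds) for illustration only; this is not the births14 data.
toy <- data.frame(
  habit = factor(
    c("nonsmoker", "nonsmoker", "nonsmoker", "smoker", "smoker"),
    levels = c("nonsmoker", "smoker")
  ),
  weight = c(7.0, 7.5, 7.4, 6.5, 6.9)
)

fit <- lm(weight ~ habit, data = toy)
group_means <- tapply(toy$weight, toy$habit, mean)

intercept <- coef(fit)[["(Intercept)"]]
slope <- coef(fit)[["habitsmoker"]]
mean_diff <- group_means[["smoker"]] - group_means[["nonsmoker"]]

# Intercept = mean of the reference group; slope = difference in group means.
print(all.equal(intercept, group_means[["nonsmoker"]]))
print(all.equal(slope, mean_diff))
```

This is also why "on average" or "expected" language is the appropriate wording for both coefficients: they describe group means, not individual babies.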