海淘手表网站/网页设计是干嘛的
1、数据推断
1.1 业务背景(缺失关键指标)
疫情期间,一家印度外卖餐厅想通过数据分析,数据挖掘提升销量,但是在历史数据中缺少了很重要的一个维度,用餐人数
订单表 菜单表
根据历史订单数据,推断出每个订单的用餐人数
订单表:
-
Order_Number
订单编号 -
Order_Date
订单日期 -
Item_Name
商品名称 -
Quantity
商品数量 -
Product_Price
商品价格 -
Total_products
订单商品数量 -
restaurant_no
连锁店编号
菜单表:
-
商品名称
-
商品价格
1.2 基本思路
确定推断规则:
-
1份主食对应1人 (1:1)
-
1份主菜对应1人 (1:1)
-
1份饮料对应1人(1:1)
-
2份小食对应1人 (2:1)
-
酱料,佐料不做计算(n:0)
-
按照上述原则 上面一餐推断出来的人数为40人——如果专门以喝酒为目的的聚餐,上述规则推断出来的出错概率很高
-
这种极端的例子可以验证在数据中的占比,如果占比不大则无太大影响
1.3 代码实现
1.3.1在现有的产品明细表里重新定义一列字段,作为我们就餐人数判断的产品分类
drop table if exists food_type;
create table food_type as
SELECTitem_name,price,restaurant_id,food_category,
CASEWHEN food_category IN ( '鸡肉类', '羊肉类', '虾类', '咖喱菜', '鱼肉类', '主菜', '芝士菜' ) THEN'主菜' WHEN food_category IN ( '饼', '米饭' ) THEN'主食' WHEN food_category IN ( '饮料', '酒', '酸奶' ) THEN'饮料' WHEN food_category IN ( '酱', '腌菜' ) THEN'佐料' ELSE '小食' END AS food_type
FROMfood_category_table;
1.3.2 重新定义过后产品的分类情况
SELECTcount( DISTINCT item_name ) AS item_num,food_type,food_category
FROMfood_type
GROUP BYfood_type,food_category
ORDER BYfood_type,food_category,item_num DESC;
1.3.3 将交易明细表与我们刚定义的产品明细表进行关联,把原始订单表中每个菜名属于哪个类别确定下来,后续可以根据这类别进行推断
SELECTa.*,b.food_type,b.price
FROMrestaurant_orders aJOIN food_type b ON a.`Item_Name` = b.item_name AND a.Restaurant_Id = b.restaurant_id;
1.3.4 目前一行记录是一笔订单的一种产品的售卖情况,如果一笔订单有若干样产品会分成若干行,我们希望把一笔订单的详情,从多行统一到一行中,同时用我们事先定义的系数计算。
select a.`Order_Number`,a.`Order_Date`,a.restaurant_id,round(sum(a.Quantity*b.price),2) as total_amount, # 计算订单总金额
sum(case when food_type='主食' then a.Quantity*1 else 0 end) as staple_food_count, # 聚合操作,分别统计出每个菜名类别对应的人数
sum(case when food_type='主菜' then a.Quantity*1 else 0 end) as main_course_count,
sum(case when food_type='饮料' then a.Quantity*1 else 0 end) as drink_count,
sum(case when food_type='小食' then a.Quantity*0.5 else 0 end) as snack_count
from restaurant_orders a join food_type b
on a.`Item_Name`=b.item_name and a.Restaurant_Id=b.restaurant_id # 把a表和b表合并
group by a.`Order_Number`,a.`Order_Date`,a.Restaurant_Id; # 合并后分组
1.3.5 比较主食,主菜,饮料,小食中的最大者
select c.*,GREATEST(c.staple_food_count,c.main_course_count,c.drink_count,c.snack_count) as max_count from
(select a.`Order_Number`,a.`Order_Date`,a.restaurant_id,round(sum(a.Quantity*b.price),2) as total_amount,
sum(case when food_type='主食' then a.Quantity*1 else 0 end) as staple_food_count,
sum(case when food_type='主菜' then a.Quantity*1 else 0 end) as main_course_count,
sum(case when food_type='饮料' then a.Quantity*1 else 0 end) as drink_count,
sum(case when food_type='小食' then a.Quantity*0.5 else 0 end) as snack_count
from restaurant_orders a join food_type b
on a.`Item_Name`=b.item_name and a.Restaurant_Id=b.restaurant_id
group by a.`Order_Number`,a.`Order_Date`,a.Restaurant_Id) c;
1.3.6 增加向下取整的逻辑,并且确保最小就餐人数为1
floor 向下取整
select c.*,
GREATEST(floor(GREATEST(c.staple_food_count,c.main_course_count,c.drink_count,c.snack_count)),1) as customer_count from
(select a.`Order_Number`,a.`Order_Date`,a.restaurant_id,round(sum(a.Quantity*b.price),2) as total_amount,
sum(case when food_type='主食' then a.Quantity*1 else 0 end) as staple_food_count,
sum(case when food_type='主菜' then a.Quantity*1 else 0 end) as main_course_count,
sum(case when food_type='饮料' then a.Quantity*1 else 0 end) as drink_count,
sum(case when food_type='小食' then a.Quantity*0.5 else 0 end) as snack_count
from restaurant_orders a join food_type b
on a.`Item_Name`=b.item_name and a.Restaurant_Id=b.restaurant_id
group by a.`Order_Number`,a.`Order_Date`,a.Restaurant_Id) c;
1.3.7 检查一下之前担心的饮料过多导致的推测异常占比
select count(case when drink_count>=5 then Order_Number else NULL end )as outlier_count,count(*)as total_count,round(count(case when drink_count>=5 then Order_Number else null end )/count(*),5)as outlier_rate
from restaurants_orders_customer_count;
2、用户标签
2.1 什么是精细化运营
-
和精细化运营相对的是粗放式运营,也就是在流量红利还在的时候,采用固定几种运营方式,大家看到的东西都是一样的,根据大众最终的实验效果去反馈运营策略。
2.1.1什么是用户标签
每一个用户与企业的商业目的相关较强的特质提取出来,就是用户标签,也叫用户画像
2.1.2 用户标签的作用
用户标签是精细化运营的抓手
发现兴趣,投其所好
发现用户特点,提升用户价值
为业务发展储备标签
2.1.3 用户打标签实操流程
1、明确现有数据
2、确定打标签的业务逻辑,将标签业务逻辑形成文档
3、编写打标签代码
2.1.4 用户标签的来源
属性标签
预测标签
统计标签
2.1.5 项目介绍
本次项目需要做到:
1.设计一套实时的基于明确规则的属性标签体系。
2.根据统计指标和业务判断给用户偏好定性打标,并存储。
3、代码实战
3.1基于时间的事实类标签
Curdate() 获取当前日期
Date_sub(date1,interval xx day) 从指定日期往前推xx天的日期#利用日期函数打上实时事实类标签
drop table if exists mall_customer_realtime_tag;
create table mall_customer_realtime_tag as
select
a.userid,
case when a.last_visit >= DATE_SUB(Curdate(), INTERVAL 30 DAY) then "一个月活跃" else "近一个月不活跃" end as "近一个月活跃标签",
case when a.last_visit >= DATE_SUB(curdate(),INTERVAL 90 DAY) then "近三个月活跃" else "近三个月不活跃" end as "近三个月活跃标签",
case when a.last_visit >= DATE_SUB(curdate(),INTERVAL 180 DAY) then "近六个月活跃" else "近六个月不活跃" end as "近六个月活跃标签",
case when a.last_visit >= DATE_SUB(curdate(),INTERVAL 360 DAY) then "近十二个月活跃" else "近十二个月不活跃" end as "近十二个月活跃标签"
from
(select
userid, max(date) as last_visit
from mall_customers_tld_header
GROUP BY userid) a
3.2 基于阈值的统计类标签
3.2.1 年龄标签 age_group_tag
select CustomerID,case when Age <=18 then '青少年'when Age between 19 and 24 then '青年'when Age between 25 and 35 then '中青年'when Age between 36 and 45 then '中年'when Age between 46 and 55 then '中老年'else '老年' end as age_group_tag
from mall_customers;
3.2.2 收入标签 income_tag
select CustomerID,case when Annual_Income>=50 then '高收入人群'else '普通收入人群' end as income_tag
from mall_customers;
3.2.3 消费能力标签 spend_score
select CustomerID,case when Spending_Score>=50 then '高消费人群'when Spending_Score between 11 and 49 then '普通消费人群'when Spending_Score =0 then '无消费人群'else '低消费人群' end as spending_tag
from mall_customers;
3.2.4 访问次数标签 freq_tag
select CustomerID,case when visits_p12>=12 then '高频访客'when visits_p12 between 6 and 11 then '中频访客'else '低频访客' end as freq_tag
from mall_customers;
3.2.5 偏好标签 daypart_preference_tag weekday_preference_tag accompany_preference_tag target_preference_tag
select CustomerID,case when morning_visits_p12/visits_p12>0.5 then '早间访客'when afternoon_visit_p12/visits_p12>0.5 then '午间访客'when evening_visits_p12/visits_p12>0.5 then '晚间访客'else null end as daypart_preference_tag
from mall_customers;select CustomerID,case when weekday_visits_p12/visits_p12>0.5 then '周中访客'when weekend_visits_p12/visits_p12>0.5 then '周末访客'else null end as weekday_preference_tag
from mall_customers;select CustomerID,case when with_kids_visits_p12/visits_p12>0.5 then '热衷带娃'when with_couple_visits_p12/visits_p12>0.5 then '热衷恋爱'else null end as accompany_preference_tag
from mall_customers;select CustomerID,case when clothing_spending_score/Spending_Score>0.5 then '时尚达人'when foodbeverage_spending_score/Spending_Score>0.5 then '饮食男女'when entertainment_spending_score/Spending_Score>0.5 then '娱乐至上'when kids_spending_score/Spending_Score>0.5 then '鸡娃狂魔'else null end as target_preference_tag
from mall_customers;
3.2.6 新客老客 history_tag
select CustomerID,case when first_visit_till_today<=30 then '新顾客'when first_visit_till_today>=180 then '老顾客'else '普通顾客' end as history_tag
from mall_customers;
3.3 使用标签圈人
3.3.1 将实时标签,统计类标签汇总 合并两个表
drop table if exists mall_customer_all_tags;
create table mall_customer_all_tags as
select a.*,b.age_group_tag,b.income_tag,b.spending_tag,b.daypart_preference_tag,b.weekday_preference_tag,b.accompany_preference_tag,b.target_preference_tag,b.history_tag
from mall_customer_realtime_tag a
left join mall_customer_aggregation_tags b
on a.UserID=b.CustomerID;
3.3.2 找出最有消费潜力的用户
select distinct UserID as customer_base
from mall_customer_all_tags
where income_tag='高收入人群' and spending_tag='低消费人群';
3.3.3 找出高收入人群中高消费人群的消费偏好
# 找出高收入人群中高消费人群的消费偏好
select group_concat(distinct UserID separator ',') as customers,target_preference_tag
from mall_customer_all_tags
where income_tag='高收入人群' and spending_tag='高消费人群'
group by target_preference_tag;
3.3.4 提取近三个月有过交易的普通消费人群
select distinct UserID
from mall_customer_all_tags
where 近三个月活跃标签='近三个月活跃' and spending_tag='普通消费人群';